- Can LLMs Maintain Fundamental Abilities under KV Cache Compression? (arXiv:2502.01941, published 3 days ago, 9 upvotes)
- The Differences Between Direct Alignment Algorithms are a Blur (arXiv:2502.01237, published 3 days ago, 105 upvotes)
- Reward-Guided Speculative Decoding for Efficient LLM Reasoning (arXiv:2501.19324, published 6 days ago, 32 upvotes)
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training (arXiv:2501.17161, published 9 days ago, 100 upvotes)
- Optimizing Large Language Model Training Using FP4 Quantization (arXiv:2501.17116, published 9 days ago, 32 upvotes)
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (arXiv:2501.12948, published 15 days ago, 301 upvotes)
- Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation (arXiv:2501.12202, published 16 days ago, 33 upvotes)
- GameFactory: Creating New Games with Generative Interactive Videos (arXiv:2501.08325, published 23 days ago, 61 upvotes)
- MangaNinja: Line Art Colorization with Precise Reference Following (arXiv:2501.08332, published 23 days ago, 56 upvotes)
- MiniMax-01: Scaling Foundation Models with Lightning Attention (arXiv:2501.08313, published 23 days ago, 272 upvotes)
- The GAN is dead; long live the GAN! A Modern GAN Baseline (arXiv:2501.05441, published 28 days ago, 87 upvotes)