Generating Multi-Image Synthetic Data for Text-to-Image Customization • Paper • 2502.01720 • Published 3 days ago • 4
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search • Paper • 2502.02508 • Published 2 days ago • 16
MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models • Paper • 2502.00698 • Published 4 days ago • 21
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding • Paper • 2502.01341 • Published 3 days ago • 31
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models • Paper • 2502.01061 • Published 4 days ago • 152
GME: Improving Universal Multimodal Retrieval by Multimodal LLMs • Paper • 2412.16855 • Published Dec 22, 2024 • 2
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs • Paper • 2501.18585 • Published 7 days ago • 49
GuardReasoner: Towards Reasoning-based LLM Safeguards • Paper • 2501.18492 • Published 7 days ago • 78
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate • Paper • 2501.17703 • Published 8 days ago • 50
Atla Selene Mini: A General Purpose Evaluation Model • Paper • 2501.17195 • Published 10 days ago • 30
Optimizing Large Language Model Training Using FP4 Quantization • Paper • 2501.17116 • Published 9 days ago • 32
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • Paper • 2501.17161 • Published 9 days ago • 100
Towards General-Purpose Model-Free Reinforcement Learning • Paper • 2501.16142 • Published 10 days ago • 24
RL + Transformer = A General-Purpose Problem Solver • Paper • 2501.14176 • Published 14 days ago • 22