SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 1 day ago • 37
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 15 days ago • 298
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 23 days ago • 53
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 125
Article A failed experiment: Infini-Attention, and why we should keep trying? Aug 14, 2024 • 57
Article Llama 3.1 - 405B, 70B & 8B with multilinguality and long context Jul 23, 2024 • 226
Article Docmatix - a huge dataset for Document Visual Question Answering Jul 18, 2024 • 72