SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 1 day ago • 37
lewtun/details_deepseek-ai__DeepSeek-R1-Distill-Qwen-1.5B Viewer • Updated about 5 hours ago • 1k • 8
lewtun/details_deepseek-ai__DeepSeek-R1-Distill-Qwen-7B Viewer • Updated about 22 hours ago • 597 • 4
lewtun/details_deepseek-ai__DeepSeek-R1-Distill-Llama-8B Viewer • Updated about 23 hours ago • 598 • 7