SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 2 days ago • 70
SmolVLM 256M & 500M Collection Collection for models & demos for even smoller SmolVLM release • 12 items • Updated 14 days ago • 65
Preference Leakage: A Contamination Problem in LLM-as-a-judge Paper • 2502.01534 • Published 3 days ago • 34
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published 4 days ago • 152
DeepRAG: Thinking to Retrieval Step by Step for Large Language Models Paper • 2502.01142 • Published 3 days ago • 15
PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models Paper • 2502.01584 • Published 3 days ago • 7
A Study on the Performance of U-Net Modifications in Retroperitoneal Tumor Segmentation Paper • 2502.00314 • Published 6 days ago • 3
Article LLM Dataset Formats 101: A No‐BS Guide for Hugging Face Devs By tegridydev • 6 days ago • 4
WildChat-50m Collection All model responses associated with the WildChat-50m paper. • 55 items • Updated 8 days ago • 6
Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain • 8 days ago • 23