SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper β’ 2502.02737 β’ Published 1 day ago β’ 32
view article Article SmolVLM Grows Smaller β Introducing the 250M & 500M Models! 14 days ago β’ 119
view article Article Fine-tune a SmolLM on domain-specific synthetic data from a LLM By davidberenstein1957 β’ Jan 3 β’ 32
SmolVLM Collection State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct β’ 5 items β’ Updated Dec 22, 2024 β’ 32
π» Local SmolLMs Collection SmolLM models in MLC, ONNX and GGUF format for local applications + in-browser demos β’ 14 items β’ Updated Dec 22, 2024 β’ 48
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper β’ 2406.17557 β’ Published Jun 25, 2024 β’ 91
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Paper β’ 2405.18392 β’ Published May 28, 2024 β’ 12
Leaderboards and benchmarks β¨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... β’ 90 items β’ Updated 1 day ago β’ 93
ZeroGPU Spaces Collection ZeroGPU Spaces made by the community β’ 17 items β’ Updated Jun 6, 2024 β’ 233
StarCoder 2 and The Stack v2: The Next Generation Paper β’ 2402.19173 β’ Published Feb 29, 2024 β’ 137
π« StarCoder2 Collection StarCoder2 models and datasets! β’ 8 items β’ Updated Mar 1, 2024 β’ 83