SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper β’ 2502.02737 β’ Published 7 days ago β’ 153
Jan 17 Releases βοΈ Collection Models and datasets of the second week of Jan 2025. β’ 23 items β’ Updated 25 days ago β’ 10
GGUF LoRA adapters Collection Adapters extracted from fine tuned models, using mergekit-extract-lora β’ 16 items β’ Updated 19 days ago β’ 3
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M β’ 16 items β’ Updated 5 days ago β’ 227
view article Article Decoding Strategies in Large Language Models By mlabonne β’ Oct 29, 2024 β’ 40
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi β’ 13 items β’ Updated Sep 18, 2024 β’ 227
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper β’ 2408.10914 β’ Published Aug 20, 2024 β’ 42