SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 1 day ago • 32
Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect Paper • 2409.17912 • Published Sep 26, 2024 • 25
view article Article Yay! Organizations can now publish blog Articles By huggingface and 3 others • 17 days ago • 30
view article Article TerjamaBench: A Cultural Benchmark for English-Darija Machine Translation By imomayiz and 4 others • 27 days ago • 27
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated about 1 hour ago • 213
📀 Dataset comparison models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 37
view article Article Finding Moroccan Arabic (Darija) in Fineweb 2 By omarkamali and 3 others • Dec 8, 2024 • 21
Falcon Mamba: The First Competitive Attention-free 7B Language Model Paper • 2410.05355 • Published Oct 7, 2024 • 33
Searching for Better ViT Baselines Collection Exploring ViT hparams and model shapes for the GPU poor (between tiny and base). • 25 items • Updated Aug 21, 2024 • 16
view article Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation Apr 29, 2024 • 76
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22, 2024 • 127
T2I-Adapter-SDXL Collection The smallest and most efficient control models for SDXL! • 8 items • Updated Sep 8, 2023 • 32
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 244