Article Introducing smolagents: simple agents that write actions in code. Dec 31, 2024 • 564
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated about 2 hours ago • 213
The Big Benchmarks Collection Collection Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 13 items • Updated Nov 18, 2024 • 193
Open LLM Leaderboard best models ❤️‍🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard • 63 items • Updated 3 days ago • 524
Article Democratization of AI, Open Source, and AI Auditing: Thoughts from the DisinfoCon Panel in Berlin By frimelle • Oct 8, 2024 • 6
Manual Configuration Collection 5 datasets showcase YAML configuration on HuggingFace. See docs: https://huggingface.co/docs/hub/datasets-manual-configuration • 5 items • Updated Nov 23, 2023 • 5
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 91
MINT-1T Collection Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24, 2024 • 58
Article Announcing Finance Commons and the Bad Data Toolbox: Pioneering Open Data and Advanced Document Processing By Pclanglais • Jul 19, 2024 • 20
Article Experimenting with Automatic PII Detection on the Hub using Presidio Jul 10, 2024 • 24
ParaNames 1.0: Creating an Entity Name Corpus for 400+ Languages using Wikidata Paper • 2405.09496 • Published May 15, 2024 • 3