Improving Transformer World Models for Data-Efficient RL Paper • 2502.01591 • Published 3 days ago • 8
Reasoning Datasets Collection Distilled synthetic Reasoning datasets • 7 items • Updated 4 days ago • 45
Article SmolVLM Grows Smaller – Introducing the 250M & 500M Models! 15 days ago • 119
Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments Paper • 2501.10893 • Published 19 days ago • 23
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 29 days ago • 253
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published Jan 4 • 90
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 88
Article Fine-tune ModernBERT for text classification using synthetic data By davidberenstein1957 • Dec 30, 2024 • 31
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search Paper • 2412.18319 • Published Dec 24, 2024 • 37
DeTikZify Collection Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ • 11 items • Updated Dec 4, 2024 • 7
ProcessBench: Identifying Process Errors in Mathematical Reasoning Paper • 2412.06559 • Published Dec 9, 2024 • 79
Article Rethinking Backpropagation: Thoughts on What's Wrong with Backpropagation By Jaward • Dec 2, 2024 • 5
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published Dec 6, 2024 • 129
Article 🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs By wolfram • Dec 4, 2024 • 76
Cut Your Losses in Large-Vocabulary Language Models Paper • 2411.09009 • Published Nov 13, 2024 • 45
Thinking LLMs: General Instruction Following with Thought Generation Paper • 2410.10630 • Published Oct 14, 2024 • 18
Article Releasing the largest multilingual open pretraining dataset By Pclanglais and 2 others • Nov 13, 2024 • 98