Aurélien-Morgan CLAUDON

Aurelien-Morgan

https://huggingface.co/retrain-pipelines

AI & ML interests

None yet

Recent Activity

upvoted a paper about 5 hours ago

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

liked a Space about 23 hours ago

deepseek-ai/deepseek-vl2-small

upvoted a paper 3 days ago

s1: Simple test-time scaling

Aurelien-Morgan's activity

upvoted a paper about 5 hours ago

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

Paper • 2501.16975 • Published 9 days ago • 23

upvoted a paper 3 days ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published 6 days ago • 88

upvoted a paper 6 days ago

Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch

Paper • 2501.18512 • Published 7 days ago • 25

upvoted 2 articles 14 days ago

Mastering Long Contexts in LLMs with KVPress

Article • By … and 1 other • 14 days ago • 59

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

Article • 15 days ago • 119

upvoted an article 16 days ago

Fine-tune ModernBERT for RAG with Synthetic Data

Article • By … and 2 others • 17 days ago • 33

upvoted a paper 21 days ago

Titans: Learning to Memorize at Test Time

Paper • 2501.00663 • Published Dec 31, 2024 • 14

upvoted a paper 22 days ago

Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI

Paper • 2409.14160 • Published Sep 21, 2024 • 2

upvoted a collection about 1 month ago

OLMo 2

Artifacts for the second set of OLMo models. • 22 items • Updated about 1 month ago • 81

upvoted a paper about 2 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 345

upvoted 2 papers 2 months ago

Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving

Paper • 2407.00079 • Published Jun 24, 2024 • 5

GRAPE: Generalizing Robot Policy via Preference Alignment

Paper • 2411.19309 • Published Nov 28, 2024 • 44

upvoted an article 2 months ago

Let’s make a generation of amazing image generation models

Article • By … and 4 others • Nov 26, 2024 • 34

upvoted a collection 3 months ago

🚀 Trending Demo

13 items • Updated Dec 24, 2024 • 9

upvoted 3 papers 3 months ago

The Surprising Effectiveness of Test-Time Training for Abstract Reasoning

Paper • 2411.07279 • Published Nov 11, 2024 • 3

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 115

SALSA: Soup-based Alignment Learning for Stronger Adaptation in RLHF

Paper • 2411.01798 • Published Nov 4, 2024 • 8

upvoted 3 collections 3 months ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated about 4 hours ago • 214

📑 Trending Papers - October 🔟

10 items • Updated Dec 24, 2024 • 6

MobileLLM

Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 103