Lewis Tunstall PRO

lewtun

https://lewtun.github.io/blog/

AI & ML interests

LLMs, LLMs, LLMs

Recent Activity

updated a Space about 4 hours ago

open-r1/open-r1-eval-leaderboard

lewtun's activity

upvoted a paper about 2 hours ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 1 day ago • 37

upvoted an article about 4 hours ago

Smol but Mighty: Can Small Models Reason well? 🤔

Article • 2 days ago • 6

upvoted an article 4 days ago

Open-R1: Update #1

Article • By [...] and 7 others • 5 days ago • 235

upvoted an article 6 days ago

Replicating DeepSeek R1 for Information Extraction

Article • 6 days ago • 28

upvoted an article 9 days ago

Welcome to Inference Providers on the Hub 🔥

Article • 10 days ago • 257

upvoted an article 10 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • 10 days ago • 646

upvoted an article 20 days ago

Gradio spaces are the perfect agent tools!

Article • 20 days ago • 13

upvoted 2 articles 21 days ago

Introducing smolagents: simple agents that write actions in code.

Article • Dec 31, 2024 • 564

Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

Article • 22 days ago • 63

upvoted a paper 29 days ago

A Simple and Provable Scaling Law for the Test-Time Compute of Large Language Models

Paper • 2411.19477 • Published Nov 29, 2024 • 6

upvoted 5 papers about 1 month ago

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 79

Evaluating Large Language Models Trained on Code

Paper • 2107.03374 • Published Jul 7, 2021 • 8

Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models

Paper • 1610.02424 • Published Oct 7, 2016 • 1

Scaling Laws for Neural Language Models

Paper • 2001.08361 • Published Jan 23, 2020 • 7

DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 49

upvoted a paper about 2 months ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 126

upvoted a collection about 2 months ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 132

upvoted 2 papers about 2 months ago

Solving math word problems with process- and outcome-based feedback

Paper • 2211.14275 • Published Nov 25, 2022 • 8

Self-Consistency Improves Chain of Thought Reasoning in Language Models

Paper • 2203.11171 • Published Mar 21, 2022 • 3

upvoted a collection 3 months ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated about 2 hours ago • 213