Florent Daudens

fdaudens

AI & ML interests

AI & Journalism

fdaudens's activity

upvoted 2 papers about 2 hours ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 2 days ago • 62

Fully Autonomous AI Agents Should Not be Developed

Paper • 2502.02649 • Published 2 days ago • 8

liked a Space about 14 hours ago

1.46k

Chat With Janus-Pro-7B

🌍

A unified multimodal understanding and generation model.

upvoted an article 1 day ago

Article

DABStep: Data Agent Benchmark for Multi-step Reasoning

3 days ago

• 26

liked a Space 1 day ago

162

Chat with DeepSeek-VL2-small

🌍

upvoted 2 articles 2 days ago

Article

🌁#86: Four Freedoms of truly open AI

and 1 other •

3 days ago

• 5

Article

From Hippocrates to AI: Reflections on the Evolution of Consent

2 days ago

• 8

liked a Space 2 days ago

114

Open Deep-Research

🏆

OpenAI's Deep Research, but open

upvoted 2 articles 2 days ago

Article

Open-source DeepResearch – Freeing our search agents

3 days ago

• 636

Article

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

3 days ago

• 67

reacted to merve's post with 👍 3 days ago

Post

3717

This week in open AI was 🔥 Let's recap! 🤗 merve/january-31-releases-679a10669bd4030090c5de4d
LLMs 💬
> Huge: AllenAI released new Tülu models based on Llama 3.1 405B that outperform DeepSeek R1, trained using Reinforcement Learning with Verifiable Rewards (RLVR) 🔥
> Mistral AI is back to open source with their "small" 24B models (base & SFT), under the Apache 2.0 license 😱
> Alibaba Qwen released their 1M context length models, Qwen2.5-Instruct-1M, great for agentic use, under the Apache 2.0 license 🔥
> Arcee AI released Virtuoso-medium, a 32.8B LLM distilled from DeepSeek V3 with a dataset of 5B+ tokens
> Velvet-14B is a new family of 14B Italian LLMs trained on 10T tokens in six languages
> OpenThinker-7B is a fine-tuned version of Qwen2.5-7B-Instruct, trained on the OpenThoughts dataset

VLMs & vision 👀
> Alibaba Qwen is back with Qwen2.5VL, with amazing new capabilities ranging from agentic computer use to zero-shot localization 🔥
> NVIDIA released a new series of Eagle2 models in 1B and 9B sizes
> DeepSeek released Janus-Pro, a new any-to-any model (image-text generation from image-text input) under the MIT license
> BEN2 is a new background removal model under the MIT license!

Audio 🗣️
> YuE is a new open-source foundation model for music generation (lyrics-to-song)

Codebase 👩🏻‍💻
> We are open-sourcing our SmolVLM training and eval codebase! https://github.com/huggingface/smollm/tree/main/vision
> Open-R1 is an open-source reproduction of R1 by the @huggingface science team https://huggingface.co/blog/open-r1

1 reply

updated a Space 3 days ago

Deepseek Download Stats

🌍

DeepSeek download stats

upvoted an article 4 days ago

Article

The AI tools for Art Newsletter - Issue 1

7 days ago

• 44

posted an update 4 days ago

Post

2284

📊 R1 just built its own download dashboard!

Some fresh stats: +6M downloads for 800+ derivative models vs 2M for originals. Watch the numbers grow here: fdaudens/deepseek-download-stats

upvoted an article 5 days ago

Article

Open-R1: Update #1

and 7 others •

5 days ago

• 237

published a Space 7 days ago

Deepseek Download Stats

🌍

DeepSeek download stats

posted an update 7 days ago

Post

3211

🎯 Kokoro TTS just hit v1.0! 🚀

Small but mighty: 82M parameters, runs locally, speaks multiple languages. The best part? It's Apache 2.0 licensed!
This could unlock so many possibilities ✨

Check it out: hexgrad/Kokoro-82M