
poterliu


AI & ML interests

None yet

Recent Activity

upvoted a collection 7 days ago
Deepseek Papers
liked a Space 8 days ago
Qwen/Qwen2.5-Max-Demo
liked a Space 10 days ago
akhaliq/anychat

Organizations

None yet

poterliu's activity

reacted to lewtun's post with 🤗 11 days ago
We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1
reacted to onekq's post with 👍 11 days ago
So 🐋DeepSeek🐋 has hit the mainstream media. But it has been a star in our little cult for at least 6 months. Its meteoric success did not come overnight; it was two years in the making.

To learn their history, just look at their 🤗 repo https://huggingface.co/deepseek-ai

* End of 2023, they launched the first model (pretrained by themselves) following the Llama 2 architecture
* June 2024, v2 (MoE architecture) surpassed Gemini 1.5, but was behind Mistral
* September, v2.5 surpassed GPT 4o mini
* December, v3 surpassed GPT 4o
* Now R1 surpassed o1

Most importantly, if you think DeepSeek's success is singular and unrivaled, that's WRONG. The following models are also near or at the o1 bar.

* Minimax-01
* Kimi k1.5
* Doubao 1.5 pro