YangWang92

yangwang92

Recent Activity

liked a dataset 2 days ago

PRIME-RL/Eurus-2-Rollout

liked a dataset 2 days ago

PRIME-RL/EurusPRM-Stage1-Data

liked a dataset 2 days ago

PRIME-RL/Eurus-2-SFT-Data

yangwang92's activity

liked 4 datasets 2 days ago

upvoted a collection 4 days ago

Reasoning Datasets

Collection

Distilled synthetic Reasoning datasets • 7 items • Updated 4 days ago • 45

liked a model 6 days ago

allenai/Llama-3.1-Tulu-3-405B

Text Generation • Updated 8 days ago • 705 • 88

upvoted a paper 6 days ago

Proximal Policy Optimization Algorithms

Paper • 1707.06347 • Published Jul 20, 2017 • 6

liked a dataset 8 days ago

bespokelabs/Bespoke-Stratos-17k

Viewer • Updated 7 days ago • 16.7k • 38.8k • 219

upvoted a paper 10 days ago

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published 14 days ago • 22

upvoted an article 12 days ago

Process Reinforcement through Implicit Rewards

Article • and 1 other • Jan 3 • 22

upvoted a paper 12 days ago

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Paper • 2501.13629 • Published 14 days ago • 42

liked a model 13 days ago

ezelikman/quietstar-8-ahead

Text Generation • Updated Mar 23, 2024 • 206 • 90

upvoted 3 papers 14 days ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 16 days ago • 86

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 15 days ago • 301

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published 17 days ago • 63

upvoted a paper 15 days ago

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Paper • 2501.12202 • Published 16 days ago • 33

liked 3 models 17 days ago

deepseek-ai/DeepSeek-R1-Distill-Llama-8B

Text Generation • Updated 5 days ago • 313k • 441

deepseek-ai/DeepSeek-R1-Zero

Text Generation • Updated 5 days ago • 25.7k • 724

deepseek-ai/DeepSeek-R1

Text Generation • Updated 5 days ago • 1.54M • 7.25k

liked a model 20 days ago

LLM360/K2

Text Generation • Updated Jul 29, 2024 • 2k • 85