YangWang92

yangwang92

AI & ML interests

None yet

Recent Activity

liked a dataset 2 days ago

PRIME-RL/Eurus-2-Rollout

liked a dataset 2 days ago

PRIME-RL/EurusPRM-Stage1-Data

liked a dataset 2 days ago

PRIME-RL/Eurus-2-SFT-Data

Organizations

yangwang92's activity

upvoted a collection 4 days ago

Reasoning Datasets

Collection

Distilled synthetic Reasoning datasets • 7 items • Updated 4 days ago • 45

upvoted a paper 6 days ago

Proximal Policy Optimization Algorithms

Paper • 1707.06347 • Published Jul 20, 2017 • 6

upvoted a paper 10 days ago

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published 14 days ago • 22

upvoted an article 12 days ago

Article

Process Reinforcement through Implicit Rewards

and 1 other • Jan 3 • 22

upvoted a paper 12 days ago

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Paper • 2501.13629 • Published 14 days ago • 42

upvoted 3 papers 14 days ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 16 days ago • 86

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 15 days ago • 301

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published 17 days ago • 63

upvoted a paper 15 days ago

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Paper • 2501.12202 • Published 16 days ago • 33

upvoted a paper 20 days ago

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 21 days ago • 67

upvoted a paper 22 days ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 23 days ago • 272

upvoted a paper 24 days ago

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published 27 days ago • 60

upvoted a paper 28 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 29 days ago • 253

upvoted 3 papers 29 days ago

Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published about 1 month ago • 68

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published about 1 month ago • 48

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 90

upvoted 2 papers 30 days ago

Test-time Computing: from System-1 Thinking to System-2 Thinking

Paper • 2501.02497 • Published Jan 5 • 41

Scaling Laws for Floating Point Quantization Training

Paper • 2501.02423 • Published Jan 5 • 26

upvoted a collection about 1 month ago

Cosmos

Collection

The collection of Cosmos models • 31 items • Updated 20 days ago • 254

upvoted a paper about 1 month ago

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Paper • 2501.01957 • Published Jan 3 • 42