20

Zhiyuan Ning

nzynzy

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

Chain-of-Retrieval Augmented Generation

upvoted a paper 3 days ago

Qwen2.5-1M Technical Report

upvoted a paper 3 days ago

Process Reinforcement through Implicit Rewards

Organizations

None yet

nzynzy's activity

upvoted 3 papers 3 days ago

Chain-of-Retrieval Augmented Generation

Paper • 2501.14342 • Published 14 days ago • 48

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published 12 days ago • 54

Process Reinforcement through Implicit Rewards

Paper • 2502.01456 • Published 4 days ago • 53

upvoted 6 papers 4 days ago

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published 21 days ago • 105

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published 18 days ago • 90

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Paper • 2501.17703 • Published 9 days ago • 51

s1: Simple test-time scaling

Paper • 2501.19393 • Published 7 days ago • 88

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19, 2024 • 51

TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Paper • 2411.15124 • Published Nov 22, 2024 • 59

upvoted 8 papers 5 days ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 77

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 345

Diving into Self-Evolving Training for Multimodal Reasoning

Paper • 2412.17451 • Published Dec 23, 2024 • 43

B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published Dec 23, 2024 • 46

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 98

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 30 days ago • 253

Enabling Scalable Oversight via Self-Evolving Critic

Paper • 2501.05727 • Published 28 days ago • 70

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 10 days ago • 100

upvoted a paper 10 days ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 16 days ago • 86

upvoted a paper 12 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 16 days ago • 301

upvoted a collection 4 months ago

LLM Reasoning Papers

Papers to improve reasoning capabilities of LLMs • 20 items • Updated 23 days ago • 113