This collection contains four initial checkpoints for https://github.com/LeslieTrue/SFTvsRL and the data needed for V-IRL training.
Tianzhe
tianzhechu
Recent Activity
authored a paper 8 days ago: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
published a dataset 8 days ago: tianzhechu/SFTvsRL_Data
updated a dataset 8 days ago: tianzhechu/SFTvsRL_Data