SFTvsRL Models & Data Collection This collection contains 4 initial checkpoints for https://github.com/LeslieTrue/SFTvsRL and necessary data for V-IRL training. • 5 items • Updated 7 days ago • 7
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 14 days ago • 101