SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 9 days ago • 100
bartowski/uncensoredai_UncensoredLM-DeepSeek-R1-Distill-Qwen-14B-GGUF Text Generation • Updated 5 days ago • 19.6k • 9
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training Paper • 2501.18511 • Published 7 days ago • 17
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 10 items • Updated 8 days ago • 86