PILAF: Optimal Human Preference Sampling for Reward Modeling • Paper 2502.04270 • Published 13 days ago
A Tale of Tails: Model Collapse as a Change of Scaling Laws • Paper 2402.07043 • Published Feb 10, 2024