R1-distilled leaderboard: Generate a leaderboard for open-r1 models using benchmark scores