leaderboards - a MoritzLaurer Collection

updated 25 days ago

Running on CPU Upgrade

62

LeaderboardExplorer

🔎

Filter and display leaderboards based on selected criteria
Running

3.96k

Chatbot Arena Leaderboard

🏆
Running on CPU Upgrade

12.4k

Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots
Running on CPU Upgrade

4.7k

MTEB Leaderboard

🥇

Select and filter benchmarks for text embedding tasks
Running on CPU Upgrade

609

Open ASR Leaderboard

🏆

Request evaluation results for a speech model
Running

417

LLM-Perf Leaderboard

🏆

Explore hardware performance for language models
Running

1.11k

Big Code Models Leaderboard

📈

Submit code models for evaluation on benchmarks
Runtime error

78

Human & GPT-4 Evaluation of LLMs Leaderboard

👩
Running

430

Can Ai Code Results

🏆

View results of the Can AI Code benchmark for coding LLMs
Running on CPU Upgrade

128

Hallucinations Leaderboard

🔥

View and submit LLM evaluations
Runtime error

104

Enterprise Scenarios Leaderboard

🥇
Running on CPU Upgrade

87

LLM Safety Leaderboard

🥇

View and submit machine learning model evaluations
Running

539

Vision Arena (Testing VLMs side-by-side)

🖼

Analyze images to detect and label objects
Running

59

CyberSecEvalTest

📈

Evaluate LLM cybersecurity risks
Running

41

Redteaming Resistance Leaderboard

💻

Display model benchmark results
Running

52

Arena Hard

🦾

Compare model answers to questions
Running

275

LLM Performance Leaderboard

🐨

View LLM Performance Leaderboard
Running on CPU Upgrade

66

AIR-Bench Leaderboard

🥇

Explore benchmark results for QA and long doc models
Running on CPU Upgrade

595

Open VLM Leaderboard

🌎

VLMEvalKit Evaluation Results Collection
Running

326

Reward Bench Leaderboard

📐

Explore and analyze RewardBench leaderboard data
Running

177

BigCodeBench Leaderboard

🥇

Explore and analyze code evaluation data
Running

10

MJ Bench Leaderboard

🥇

Display and filter multimodal model leaderboard results
Running

89

MTEB Arena

⚔

Teach, test, evaluate language models with MTEB Arena
Running on CPU Upgrade

146

Open LLM Progress Tracker

🔬

Visualize LLM progress with interactive filters
Running

88

Judge Arena

💻

Compare AI models by voting on responses
Running on Zero

283

TTS Spaces Arena

🤗

Blind vote on HF TTS models!