This leaderboard has been evaluating LLMs since June 2024 on IFEval, MuSR, GPQA, MATH, BBH, and MMLU-Pro.
Open-LLM performances are plateauing, let’s make the leaderboard steep again
🏔 Update leaderboard for fair model evaluation
Open LLM Leaderboard
🏆 Track, rank and evaluate open LLMs and chatbots
open-llm-leaderboard/contents
open-llm-leaderboard/results