AI Phone Leaderboard
More advanced and challenging multi-task evaluation
Track, rank and evaluate open LLMs and chatbots