AutoBench 1.0 is live. The Collective-LLM-as-a-Judge model benchmark: https://huggingface.co./blog/PeterKruger/autobench
Article: Escape the Benchmark Trap: AutoBench - the Collective-LLM-as-a-Judge System for Evaluating AI models (ASI-Ready!)