Post 427 AutoBench 1.0 is live. The Collective-LLM-as-a-Judge model benchmark: https://huggingface.co./blog/PeterKruger/autobench
Article Escape the Benchmark Trap: AutoBench - the Collective-LLM-as-a-Judge System for Evaluating AI models (ASI-Ready!) By PeterKruger • 5 days ago