AutoBench 1.0 is live. The Collective-LLM-as-a-Judge model benchmark: https://huggingface.co./blog/PeterKruger/autobench