---
title: BenchBench Leaderboard
emoji: 🏋️‍♂️
colorFrom: gray
colorTo: blue
sdk: streamlit
sdk_version: 1.36.0
app_file: app.py
pinned: true
license: apache-2.0
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

```
@misc{perlitz2024benchmarkagreementtestingright,
      title={Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation},
      author={Yotam Perlitz and Ariel Gera and Ofir Arviv and Asaf Yehudai and Elron Bandel and Eyal Shnarch and Michal Shmueli-Scheuer and Leshem Choshen},
      year={2024},
      eprint={2407.13696},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2407.13696},
}
```