Space: agent-evals/core_leaderboard (duplicated from benediktstroebl/hal)
Directory: core_leaderboard/utils (revision a2d5cb2)
3 contributors
History: 11 commits
Latest commit: 5a7e21a by benediktstroebl, "added failure report and two new swebench variants", 7 months ago
File           Scan   Size      Updated        Last commit message
data.py        Safe   9.47 kB   7 months ago   format update and added monitor llm client backend
pareto.py      Safe   1.34 kB   7 months ago   big update with raw predictions section and dropdowns that dynamically parse agents of current leaderboard
processing.py  Safe   6.27 kB   7 months ago   added failure report and two new swebench variants
viz.py         Safe   10.3 kB   7 months ago   added failure report and two new swebench variants