Space: agent-evals/core_leaderboard (duplicated from benediktstroebl/hal)
338177f
core_leaderboard/evals_live
3 contributors
History: 22 commits
benediktstroebl
Upload swebench_verified_Agentless_gpt-4o-mini-2024-07-18_50_Instances_1723916965.json
01fb261
verified
7 months ago
File | Scan | Size | Storage | Last commit message | Updated
swebench_lite_example_agent_17227906123.json | Safe | 22 kB | LFS | added failure report and two new swebench variants | 7 months ago
swebench_verified_Agentless_gpt-4o-mini-2024-07-18_50_Instances_1723916965.json | Safe | 8.96 MB | LFS | Upload swebench_verified_Agentless_gpt-4o-mini-2024-07-18_50_Instances_1723916965.json | 7 months ago
usaco_USACO_Episodic_gpt-4o-mini-2024-07-18_1723429624.json | Safe | 435 MB | LFS | Upload usaco_USACO_Episodic_gpt-4o-mini-2024-07-18_1723429624.json | 7 months ago
usaco_USACO_Reflexion__Episodic__Semantic_gpt-4o-mini-2024-07-18_1723558382.json | Safe | 511 MB | LFS | Upload usaco_USACO_Reflexion__Episodic__Semantic_gpt-4o-mini-2024-07-18_1723558382.json | 7 months ago
usaco_USACO_Reflexion__Episodic_gpt-4o-mini-2024-07-18_1723512712.json | Safe | 836 MB | LFS | Upload 3 files | 7 months ago
usaco_USACO_Reflexion__Semantic_gpt-4o-mini-2024-07-18_1723548931.json | Safe | 1.36 GB | LFS | Upload 3 files | 7 months ago
usaco_USACO_Reflexion_gpt-4o-mini-2024-07-18_1723501533.json | Safe | 1.19 GB | LFS | Upload 3 files | 7 months ago
usaco_USACO_Semantic_gpt-4o-mini-2024-07-18_1723431631.json | Safe | 1.22 GB | LFS | Upload usaco_USACO_Semantic_gpt-4o-mini-2024-07-18_1723431631.json | 7 months ago
usaco_USACO_Zero-shot_gpt-4o-mini-2024-07-18_1723417375.json | Safe | 790 MB | LFS | fixed broken fle | 7 months ago