Space: agent-evals/core_leaderboard (duplicated from benediktstroebl/hal)
Revision: cb163b3
Path: core_leaderboard/evals_live
History: 16 commits, 3 contributors
Latest commit: c941390 by benediktstroebl, "Upload 3 files" (verified, 7 months ago)
Files (each entry: name, then scan status, size, storage, last commit message, last modified):

swebench_lite_example_agent_1722587866.json
    Safe, 8.44 kB, LFS, "update", 7 months ago
swebench_lite_example_agent_17227906123.json
    Safe, 10.4 kB, LFS, "update", 7 months ago
swebench_lite_example_agent_1722790656.json
    Safe, 10.4 kB, LFS, "update", 7 months ago
usaco_USACO_Episodic_gpt-4o-mini-2024-07-18_1723429624.json
    Safe, 435 MB, LFS, "Upload usaco_USACO_Episodic_gpt-4o-mini-2024-07-18_1723429624.json", 7 months ago
usaco_USACO_Reflexion__Episodic_gpt-4o-mini-2024-07-18_1723512712.json
    Safe, 836 MB, LFS, "Upload 3 files", 7 months ago
usaco_USACO_Reflexion__Semantic_gpt-4o-mini-2024-07-18_1723548931.json
    Safe, 1.36 GB, LFS, "Upload 3 files", 7 months ago
usaco_USACO_Reflexion_gpt-4o-mini-2024-07-18_1723501533.json
    Safe, 1.19 GB, LFS, "Upload 3 files", 7 months ago
usaco_USACO_Semantic_gpt-4o-mini-2024-07-18_1723431631.json
    Safe, 1.22 GB, LFS, "Upload usaco_USACO_Semantic_gpt-4o-mini-2024-07-18_1723431631.json", 7 months ago
usaco_USACO_Zero-shot_gpt-4o-mini-2024-07-18_1723417375.json
    Safe, 790 MB, LFS, "fixed broken fle", 7 months ago