Commit History

updatae db
eadf8af

Zachary Siegel committed on

fix typo in agent name
7ccdf25

Zachary Siegel committed on

verify o1 mini
56eb4b8

Zachary Siegel committed on

update claude and o1 mini
9bb290b

Zachary Siegel committed on

Upload preprocessed_traces.db
34abc58
verified

siegelz committed on

Upload preprocessed_traces.db
14d2f33
verified

siegelz committed on

update verified agents
ffe8812

Zachary Siegel committed on

Merge branch 'main' of https://huggingface.co./spaces/agent-evals/core_leaderboard
1453401

Zachary Siegel committed on

update verification
8b95f64

Zachary Siegel committed on

Upload preprocessed_traces.db
1090acb
verified

siegelz committed on

update corebench results
fd01afd

Zachary Siegel committed on

Merge branch 'main' of https://huggingface.co./spaces/agent-evals/core_leaderboard
46c98e2

Zachary Siegel committed on

Upload preprocessed_traces.db
6162ff9
verified

siegelz committed on

fix typo and verify
67c84a0

Zachary Siegel committed on

Upload preprocessed_traces.db
acc08a8
verified

siegelz committed on

added o1 mini and claude 35 sonnett
8e71027

Zachary Siegel committed on

add successful and failed tasks
9381c67

Zachary Siegel committed on

added evals live
705e0d0

Zachary Siegel committed on

update title
bc0f99c

Zachary Siegel committed on

submit to any of the three levels
8fafb33

Zachary Siegel committed on

email link if files too big
22212ce

Zachary Siegel committed on

edits to submission instructions
fd7b6c5

Zachary Siegel committed on

note about file modification
a1b0cc7

Zachary Siegel committed on

correct formatting
8b52643

Zachary Siegel committed on

Merge branch 'main' of https://huggingface.co./spaces/agent-evals/core_leaderboard
290d5e9

Zachary Siegel committed on

agent submission instructions
2c91b5e

Zachary Siegel committed on

Upload preprocessed_traces.db
51c839c
verified

siegelz committed on

verify the agents
abf78cc

Zachary Siegel committed on

add results to leaderboard
8de3f0a

Zachary Siegel committed on

remove agents
797d23f

Zachary Siegel committed on

added first agent to leaderboard
64319c0

Zachary Siegel committed on

scaffold for core bench
b335ab8

Zachary Siegel committed on

core bench outline
2faf3bd

Zachary Siegel committed on

Upload preprocessed_traces.db
de4df51
verified

benediktstroebl committed on

Upload verified_agents.yaml
e92240d
verified

benediktstroebl committed on

Upload requirements.txt
b56511a
verified

benediktstroebl committed on

Upload preprocessed_traces.db
7db4465
verified

benediktstroebl committed on