ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery Paper • 2410.05080 • Published Oct 7 • 19
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents Paper • 2410.05243 • Published Oct 7 • 16
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization Paper • 2405.15071 • Published May 23 • 37
INaturalist-2021 Fine-tunes Collection Fine-tune experiments for various `timm` models on the INaturalist 2021 Challenge dataset (https://github.com/visipedia/inat_comp/tree/master/2021) • 5 items • Updated Oct 25, 2023 • 6
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning Paper • 2309.05653 • Published Sep 11, 2023 • 10