Training Software Engineering Agents and Verifiers with SWE-Gym • Paper • 2412.21139 • Published 13 days ago • 20
Law of the Weakest Link: Cross Capabilities of Large Language Models • Paper • 2409.19951 • Published Sep 30, 2024 • 54
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents • Paper • 2407.16741 • Published Jul 23, 2024 • 69
Advancing LLM Reasoning Generalists with Preference Trees • Paper • 2404.02078 • Published Apr 2, 2024 • 44
Executable Code Actions Elicit Better LLM Agents • Paper • 2402.01030 • Published Feb 1, 2024 • 40
MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback • Paper • 2309.10691 • Published Sep 19, 2023 • 4