How Far is Video Generation from World Model: A Physical Law Perspective Paper • 2411.02385 • Published 5 days ago • 27
Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets Paper • 2410.01779 • Published Oct 2 • 1
AMD-OLMo Collection AMD-OLMo is a series of 1-billion-parameter language models, based on OLMo, trained by AMD on AMD Instinct™ MI250 GPUs. • 4 items • Updated 9 days ago • 16
Sparsh Collection Models and datasets for Sparsh: Self-supervised touch representations for vision-based tactile sensing • 15 items • Updated 17 days ago • 9
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 3 days ago • 89
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions Paper • 2410.20424 • Published 14 days ago • 36
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces Paper • 2410.09918 • Published 27 days ago • 2
SurCo: Learning Linear Surrogates For Combinatorial Nonlinear Optimization Problems Paper • 2210.12547 • Published Oct 22, 2022 • 1
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping Paper • 2402.14083 • Published Feb 21 • 47
WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks Paper • 2407.05291 • Published Jul 7 • 1
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? Paper • 2403.07718 • Published Mar 12 • 1
steiner-preview Collection Reasoning models trained on synthetic data using reinforcement learning. • 3 items • Updated 21 days ago • 23
Transformers.js v3: WebGPU support, new models & tasks, and more… Article • 19 days ago • 58
AutoTrain: No-code training for state-of-the-art models Paper • 2410.15735 • Published 20 days ago • 55
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering Paper • 2405.15793 • Published May 6 • 1