Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 3 days ago • 61
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published 7 days ago • 55
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution Paper • 2502.18449 • Published 12 days ago • 67
VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing Paper • 2502.17258 • Published 13 days ago • 72
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information Paper • 2502.14258 • Published 17 days ago • 25
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 17 days ago • 128
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published 17 days ago • 177
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published 17 days ago • 160
Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator Paper • 2502.19204 • Published 11 days ago • 11
DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping Paper • 2502.20900 • Published 9 days ago • 7
Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids Paper • 2502.20396 • Published 10 days ago • 12
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Paper • 2503.01307 • Published 6 days ago • 30
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models Paper • 2503.01774 • Published 6 days ago • 37
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published 6 days ago • 65
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published Feb 3 • 112
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published Feb 3 • 186