Vince

bolerovt

AI & ML interests

None yet

Organizations

None yet

bolerovt's activity

upvoted 7 papers about 1 hour ago

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published 3 days ago • 61

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published 3 days ago • 69

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers

Paper • 2503.00865 • Published 7 days ago • 55

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published 6 days ago • 59

Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published 12 days ago • 44

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published 12 days ago • 67

VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing

Paper • 2502.17258 • Published 13 days ago • 72

upvoted 11 papers 5 days ago

Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information

Paper • 2502.14258 • Published 17 days ago • 25

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 17 days ago • 128

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published 17 days ago • 177

LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers

Paper • 2502.15007 • Published 17 days ago • 160

Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator

Paper • 2502.19204 • Published 11 days ago • 11

Towards an AI co-scientist

Paper • 2502.18864 • Published 11 days ago • 41

DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping

Paper • 2502.20900 • Published 9 days ago • 7

Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids

Paper • 2502.20396 • Published 10 days ago • 12

Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

Paper • 2503.01307 • Published 6 days ago • 30

Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Paper • 2503.01774 • Published 6 days ago • 37

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published 6 days ago • 65

upvoted 2 papers about 1 month ago

The Differences Between Direct Alignment Algorithms are a Blur

Paper • 2502.01237 • Published Feb 3 • 112

OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

Paper • 2502.01061 • Published Feb 3 • 186