Submitted by che111 41 MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning · 9 authors 1
Submitted by zhoutianyi 32 R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts · 3 authors 3
Submitted by Guizhen 18 FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving · 9 authors 1
Submitted by shuaishuaicdp 15 CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale · 9 authors 1
Submitted by JiangYi 12 UniTok: A Unified Tokenizer for Visual Generation and Understanding · 8 authors 1
Submitted by keanudicap 9 Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance · 9 authors 1
Submitted by akhaliq 9 FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute · 10 authors 1
Submitted by BestWishYsh 8 Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think · 8 authors 1
Submitted by AlignAI 7 Guardians of the Agentic System: Preventing Many Shots Jailbreak with Agentic System · 6 authors 1
Submitted by akhaliq 7 Mobius: Text to Seamless Looping Video Generation via Latent Shift · 7 authors 1
Submitted by mizersy 7 SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning · 6 authors 1
Submitted by thuhsy 6 Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting · 6 authors 1
Submitted by akhaliq 6 R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning · 13 authors 1
Submitted by OliverRen 5 Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation · 6 authors 1
Submitted by imsuperkong 1 Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling · 3 authors 1