Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong Paper • 2501.09775 • Published 8 days ago • 26
Textoon: Generating Vivid 2D Cartoon Characters from Text Descriptions Paper • 2501.10020 • Published 7 days ago • 21
GameFactory: Creating New Games with Generative Interactive Videos Paper • 2501.08325 • Published 10 days ago • 57
CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation Paper • 2501.09433 • Published 8 days ago • 17
Learnings from Scaling Visual Tokenizers for Reconstruction and Generation Paper • 2501.09755 • Published 8 days ago • 33
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 8 days ago • 64
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning Paper • 2501.06458 • Published 13 days ago • 29
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Paper • 2501.06282 • Published 14 days ago • 39
Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding Paper • 2501.07888 • Published 10 days ago • 13
Diffusion Adversarial Post-Training for One-Step Video Generation Paper • 2501.08316 • Published 10 days ago • 32
3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering Paper • 2501.05131 • Published 15 days ago • 33
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 10 days ago • 268
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 10 days ago • 47
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration Paper • 2501.01320 • Published 22 days ago • 11
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models Paper • 2501.01423 • Published 22 days ago • 36
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 23 days ago • 97
LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models Paper • 2501.00874 • Published 23 days ago • 12