2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published 11 days ago • 92
On Domain-Specific Post-Training for Multimodal Large Language Models Paper • 2411.19930 • Published Nov 29, 2024 • 25
DressRecon: Freeform 4D Human Reconstruction from Monocular Video Paper • 2409.20563 • Published Sep 30, 2024 • 8