Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published 21 days ago • 120
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models Paper • 2502.02492 • Published 24 days ago • 57
Unifying Specialized Visual Encoders for Video Language Models Paper • 2501.01426 • Published Jan 2 • 21
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems Paper • 2407.01370 • Published Jul 1, 2024 • 86
Salesforce/xgen-mm-phi3-mini-instruct-r-v1 Image-Text-to-Text • Updated 26 days ago • 1.17k • 184
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild Paper • 2305.11147 • Published May 18, 2023 • 3