🔥🔥 Introducing Ola! A state-of-the-art omni-modal understanding model with an advanced progressive modality alignment strategy! Ola ranks #1 on the OpenCompass Leaderboard (<10B).
📜 Paper: https://arxiv.org/abs/2502.04328
🛠️ Code: https://github.com/Ola-Omni/Ola
🛠️ We have fully released our video & audio training data and our intermediate image & video models at THUdyh/ola-67b8220eb93406ec87aeec37. Try building your own powerful omni-modal model with our data and models!
Ola (Collection): Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment • 4 items • Updated 8 days ago • 2
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 8 days ago • 118