microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • Updated about 4 hours ago • 7.35k • 513
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features • Paper • 2502.14786 • Published 8 days ago • 118