Unifying Vision, Text, and Layout for Universal Document Processing Paper • 2212.02623 • Published Dec 5, 2022 • 11
Hf-native ColVision Models Collection Models that can be used with the native transformers 🤗 implementation instead of colpali-engine. • 2 items • Updated Jan 23 • 3
mHuBERT-147 models Collection Compact yet powerful multilingual speech representation models based on the HuBERT architecture. • 3 items • Updated Jun 4, 2024 • 8
Running 543 Vision Arena (Testing VLMs side-by-side) 🖼 Analyze images to detect and label objects
Running 526 Scaling test-time compute 📈 Enhance math problem solving by scaling test-time compute