VideoLLaMA2
Media understanding
Generate detailed image descriptions and highlight objects
Compare different visual question answering models
A unified multimodal understanding and generation model.
Generate answers to questions about images
Chat with an AI that understands images and text
Ask questions about images and get answers
ViLT VQA with FlanT5 and Translations
Answer queries and manipulate images using text input
Try xAI's Grok 2 vision model