VLMEvalKit Evaluation Results Collection
Generate images from text prompts
Ask questions about YouTube video content