VLMEvalKit Evaluation Results Collection
A leaderboard for multimodal models
Create videos with FFmpeg + Qwen2.5-Coder
Engage in multimodal conversations with images and videos