Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities Paper • 2308.12966 • Published Aug 24, 2023 • 8
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models Paper • 2311.07919 • Published Nov 14, 2023 • 10
Audio Dialogues: Dialogues dataset for audio and music understanding Paper • 2404.07616 • Published Apr 11, 2024 • 16
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 139