AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? Paper • 2412.02611 • Published Dec 3, 2024 • 23
Article Key Insights into the Law of Vision Representations in MLLMs By Borise • Sep 2, 2024 • 17
HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption Paper • 2310.01779 • Published Oct 3, 2023 • 4