view article Article Key Insights into the Law of Vision Representations in MLLMs By Borise • Sep 2 • 18
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding Paper • 2405.08344 • Published May 14 • 12
Compositional Text-to-Image Generation with Dense Blob Representations Paper • 2405.08246 • Published May 14 • 12
Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning Paper • 2405.08054 • Published May 13 • 21