PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs • Paper • 2402.08657 • Published Feb 13
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models • Paper • 2410.06154 • Published 9 days ago