V2PE - an OpenGVLab Collection

V2PE

Updated Jan 10

Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding

OpenGVLab/V2PE

Updated Dec 13, 2024 • 4

OpenGVLab/V2PE-Data

Preview • Updated Dec 14, 2024 • 647 • 6

V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding

Paper • 2412.09616 • Published Dec 12, 2024 • 1