ContextualBench_Leaderboard / caption_vqa.tsv
Model	Bleu-4	Rouge-L	Cider	WUPS@0.9	WUPS@0.0
BLIP-2	10.2	25.4	18.8	42.3	65.9
InstructBLIP	4.2	21.1	0.0	42.2	67.9
mPLUG-Owl	3.0	15.4	0.1	13.6	42.0
mPLUG-Owl2	4.4	18.1	0.0	16.7	43.9
LLaVA-v1.5	4.3	17.8	0.1	17.1	43.9
LLaVA-v1.6	2.6	13.6	0.0	9.7	38.1
MMICL	10.8	25.2	22.1	47.9	72.1
OpenFlamingo	10.5	27.1	26.6	18.0	32.1
Otter	12.2	26.3	25.7	20.1	33.6
GPT-4V	4.5	18.1	0.0	19.9	542.4