Image-Text-to-Text
Safetensors
llava_llama
BoyuNLP commited on
Commit
1849123
·
verified ·
1 Parent(s): 87b58f0

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +6 -4
README.md CHANGED
@@ -66,7 +66,7 @@ UGround is a storng GUI visual grounding model trained with a simple recipe. Che
66
  | Qwen-VL | Qwen-VL | | 9.5 | 4.8 | 5.7 | 5.0 | 3.5 | 2.4 | 5.2 |
67
  | SeeClick | Qwen-VL | SeeClick | 78.0 | 52.0 | 72.2 | 30.0 | 55.7 | 32.5 | 53.4 |
68
  | Qwen-GUI | Qwen-VL | GUICourse | 52.4 | 10.9 | 45.9 | 5.7 | 43.0 | 13.6 | 28.6 |
69
- | **UGround-V1** | LLaVA-UGround-V1 | UGround-V1 | 82.8 | 60.3 | 82.5 | 63.6 | 80.4 | 70.4 | 73.3 |
70
  | Qwen2-VL | Qwen2-VL | | 61.3 | 39.3 | 52.0 | 45.0 | 33.0 | 21.8 | 42.1 |
71
  | Auguvis-G-7B | Qwen2-VL | Aguvis-Stage-1 | 88.3 | 78.2 | 88.1 | 70.7 | 85.7 | 74.8 | 81.0 |
72
  | Auguvis-7B | Qwen2-VL | Aguvis-Stage-1&2 | **95.6** | 77.7 | **93.8** | 67.1 | 88.3 | 75.2 | 83.0 |
@@ -86,11 +86,13 @@ UGround is a storng GUI visual grounding model trained with a simple recipe. Che
86
  | GPT-4o | Qwen-VL | Qwen-VL | | 21.3 | 21.4 | 18.6 | 10.7 | 9.1 | 5.8 | 14.5 |
87
  | GPT-4o | SeeClick | Qwen-VL | SeeClick | 81.0 | 59.8 | 69.6 | 33.6 | 43.9 | 26.2 | 52.4 |
88
  | GPT-4o | Qwen-GUI | Qwen-VL | GUICourse | 67.8 | 24.5 | 53.1 | 16.4 | 50.4 | 18.5 | 38.5 |
89
- | GPT-4o | UGround-V1 | LLaVA-UGround-V1 | UGround-V1 | 93.4 | 76.9 | 92.8 | 67.9 | 88.7 | 68.9 | 81.4 |
90
  | GPT-4o | OS-Atlas-Base-4B | InternVL | OS-Atlas | **94.1** | 73.8 | 77.8 | 47.1 | 86.5 | 65.3 | 74.1 |
91
  | GPT-4o | OS-Atlas-Base-7B | Qwen2-VL | OS-Atlas | 93.8 | **79.9** | 90.2 | 66.4 | **92.6** | **79.1** | 83.7 |
92
- | GPT-4o | UGround-V1-2B (Qwen2-VL) | Qwen2-VL | UGround-V1 | **94.1** | 77.7 | 92.8 | 63.6 | 90.0 | 70.9 | 81.5 |
93
- | GPT-4o | UGround-V1-7B (Qwen2-VL) | Qwen2-VL | UGround-V1 | **94.1** | **79.9** | **93.3** | **73.6** | 89.6 | 73.3 | **84.0** |
 
 
94
 
95
 
96
 
 
66
  | Qwen-VL | Qwen-VL | | 9.5 | 4.8 | 5.7 | 5.0 | 3.5 | 2.4 | 5.2 |
67
  | SeeClick | Qwen-VL | SeeClick | 78.0 | 52.0 | 72.2 | 30.0 | 55.7 | 32.5 | 53.4 |
68
  | Qwen-GUI | Qwen-VL | GUICourse | 52.4 | 10.9 | 45.9 | 5.7 | 43.0 | 13.6 | 28.6 |
69
+ | **UGround-V1** | LLaVA-UGround-V1 | UGround-V1 | **82.8** | **60.3** | **82.5** | **63.6** | **80.4** | **70.4** | **73.3** |
70
  | Qwen2-VL | Qwen2-VL | | 61.3 | 39.3 | 52.0 | 45.0 | 33.0 | 21.8 | 42.1 |
71
  | Auguvis-G-7B | Qwen2-VL | Aguvis-Stage-1 | 88.3 | 78.2 | 88.1 | 70.7 | 85.7 | 74.8 | 81.0 |
72
  | Auguvis-7B | Qwen2-VL | Aguvis-Stage-1&2 | **95.6** | 77.7 | **93.8** | 67.1 | 88.3 | 75.2 | 83.0 |
 
86
  | GPT-4o | Qwen-VL | Qwen-VL | | 21.3 | 21.4 | 18.6 | 10.7 | 9.1 | 5.8 | 14.5 |
87
  | GPT-4o | SeeClick | Qwen-VL | SeeClick | 81.0 | 59.8 | 69.6 | 33.6 | 43.9 | 26.2 | 52.4 |
88
  | GPT-4o | Qwen-GUI | Qwen-VL | GUICourse | 67.8 | 24.5 | 53.1 | 16.4 | 50.4 | 18.5 | 38.5 |
89
+ | GPT-4o | **UGround-V1** | LLaVA-UGround-V1 | UGround-V1 | **93.4** | **76.9** | **92.8** | **67.9** | **88.7** | **68.9** | **81.4** |
90
  | GPT-4o | OS-Atlas-Base-4B | InternVL | OS-Atlas | **94.1** | 73.8 | 77.8 | 47.1 | 86.5 | 65.3 | 74.1 |
91
  | GPT-4o | OS-Atlas-Base-7B | Qwen2-VL | OS-Atlas | 93.8 | **79.9** | 90.2 | 66.4 | **92.6** | **79.1** | 83.7 |
92
+ | GPT-4o | **UGround-V1-2B (Qwen2-VL)** | Qwen2-VL | UGround-V1 | **94.1** | 77.7 | 92.8 | 63.6 | 90.0 | 70.9 | 81.5 |
93
+ | GPT-4o | **UGround-V1-7B (Qwen2-VL)** | Qwen2-VL | UGround-V1 | **94.1** | **79.9** | **93.3** | **73.6** | 89.6 | 73.3 | **84.0** |
94
+
95
+
96
 
97
 
98