qazimbhat1 committed on
Commit
262ecb0
1 Parent(s): 1343bb5

Update README_code_model.md

  1. README_code_model.md +4 -4
README_code_model.md CHANGED
@@ -71,7 +71,7 @@ aiming to assess perceptual and cognitive capability of MLLMs within 14 sub-task
  | CrystalChat-7B | 1456.53 | **308.21** | 86.96 | 67.77 | **57.84** |
  | Vicuna-7B | **1481.12** | 302.85 | **87.174** | **67.97** | 56.49 |
 
- *Table 1: Comparison of different LLM backbones on visual language understanding benchmarks. All models are instruction-tuned on the general domain data (i.e. LLaVA)*
+ **Table 3:** Comparison of different LLM backbones on visual language understanding benchmarks. All models are instruction-tuned on the general domain data (i.e. LLaVA)*
 
 
 
@@ -103,7 +103,7 @@ The dataset chosen was created by LLaVA with academic-task-oriented VQA data mix
  | VG [25] | 86K | Provide the bounding box coordinate of the region this sentence describes. |
  | **Total** | **665K** | |
 
- *Table 2. Instruction-following Data Mixture of LLaVA-1.5.*
+ **Table 4:** Instruction-following Data Mixture of LLaVA-1.5.*
 
  #### Web2Code Data
 
@@ -130,7 +130,7 @@ DWU<sub>R</sub>: We refined the WebSRC question-answer data to improve its quali
  | **Avg DOM Depth** | 5.3±1.0 | 6.5±1.0 |
  | **Avg Unique Tags** | 13.6±2.7 | 13.5±2.5 |
 
- *Table 3. DWCG is a newly generated GPT-3.5-based dataset, while DWCG<sub>R</sub> is the refined dataset that utilizes WebSight and Pix2Code datasets*
+ **Table 5:** DWCG is a newly generated GPT-3.5-based dataset, while DWCG<sub>R</sub> is the refined dataset that utilizes WebSight and Pix2Code datasets*
 
 
  ### Webpage Understanding Datasets
@@ -140,7 +140,7 @@ DWU<sub>R</sub>: We refined the WebSRC question-answer data to improve its quali
  | **Instruction** | ✓ | ✓ |
  | **Size** | 243.5K | 51.5K |
 
- *Table 4. Distribution of DWU and DWU<sub>R</sub> datasets. Both datasets include high-quality question-answer pairs for webpage understanding.*
+ **Table 6:** Distribution of DWU and DWU<sub>R</sub> datasets. Both datasets include high-quality question-answer pairs for webpage understanding.*
 
 
 