Commit 0950c7d (verified) by shenyunhang · 1 Parent(s): ec8ca7e

Update README.md

Files changed (1)
  1. README.md +13 -6
README.md CHANGED
@@ -2,6 +2,8 @@
 license: apache-2.0
 datasets:
 - VITA-MLLM/Long-VITA-Training-Data
+base_model:
+- VITA-MLLM/Long-VITA-16K
 ---
 
 
@@ -14,21 +16,26 @@ Long-VITA is a strong long-context visual language model and supports more than
 
 - This weight is trained on Ascend NPU with MindSpeed.
 
-- To infer and evaluate on Nvidia GPU, we also implement Long-VITA on Megatron with Transformer Engine.
-
-- The converted weight is in https://huggingface.co/VITA-MLLM/Long-VITA-128K_MG.
+- To infer and evaluate on Nvidia GPUs, we also implemented Long-VITA on Megatron with the Transformer Engine. The converted weight is in https://huggingface.co/VITA-MLLM/Long-VITA-128K_MG.
 
 
 ## 📈 Experimental Results
 - **Comparison of image understanding**.
 
-![image](https://github.com/user-attachments/assets/30f62f51-675e-4dac-9f18-f743c311f9be)
-
+![image](https://github.com/user-attachments/assets/235bdb0e-37e6-4a5f-b20b-21b0bb83278a)
+![image](https://github.com/user-attachments/assets/72250c5b-7d33-4dba-98ab-0539bae08703)
 
 
 - **Comparison of video understanding**.
 
-![image](https://github.com/user-attachments/assets/01892ff3-cdcd-4d15-ad6d-5cc99ccbfa70)
+![image](https://github.com/user-attachments/assets/7f09662b-bd53-4504-927a-0e45214a049d)
+
+![image](https://github.com/user-attachments/assets/87bd2f4d-baf5-4a63-8002-151e30f52147)
+
+
+- **Effectiveness of Logits-Masked LM Head**.
+
+![image](https://github.com/user-attachments/assets/7a06b4dd-267c-470f-80f2-d26c87e23460)
 
 
 