 
---
library_name: adapter-transformers
pipeline_tag: text-classification
---

<p align="center">
<img src="https://s11.ax1x.com/2023/12/28/piqvDMV.png" width="250" style="margin-bottom: 0.2;"/>
</p>
<h2 align="center"><a href="https://arxiv.org/abs/2401.15947">MoE-LLaVA-Qwen1.5-1.8B×4-Top2: When Vision Meets a Small-Scale Language Model and a Vietnamese Synthetic Dataset</a></h2>
# Introducing MoE-LLaVA-Qwen1.5-1.8B×4-Top2 for Vietnamese

We are excited to present MoE-LLaVA-Qwen1.5-1.8B×4-Top2, tailored for the Vietnamese language. This model is part of our ongoing effort to develop Vision Language Models (VLMs) for Vietnamese, a space that is still limited and dominated by larger models (~7B parameters). Our model activates only about 2.2B parameters per forward pass, significantly reducing the memory footprint, and it can be quantized for local execution.
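The ~2.2B active parameters come from Top-2 routing: for each token, a gating network scores the four experts and only the two highest-scoring ones are actually run. A minimal sketch of this routing idea (illustrative only, not the model's actual implementation):

```python
import numpy as np

def top2_moe(x, expert_weights, gate_weights):
    """Toy Top-2 mixture-of-experts layer.

    Each expert is a single linear map here; only the two experts with
    the highest gate scores are evaluated for a given token, so roughly
    half of the expert parameters are active per forward pass.
    """
    logits = x @ gate_weights                # gate score per expert, shape (num_experts,)
    top2 = np.argsort(logits)[-2:]           # indices of the 2 best-scoring experts
    probs = np.exp(logits[top2] - logits[top2].max())
    probs /= probs.sum()                     # softmax over the 2 selected experts
    # Only the selected experts are computed; the other 2 are skipped entirely.
    return sum(p * (x @ expert_weights[i]) for p, i in zip(probs, top2))

rng = np.random.default_rng(0)
d, num_experts = 8, 4
x = rng.normal(size=d)
experts = [rng.normal(size=(d, d)) for _ in range(num_experts)]
gate = rng.normal(size=(d, num_experts))
y = top2_moe(x, experts, gate)
```

Scaling the same idea up, 4 experts with Top-2 routing means each token pays the compute and memory cost of only 2 experts plus the shared (non-expert) layers.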
 
For the COCO dataset, we utilized LLaVA-style prompts to generate data:

- **Caption-based Prompting:** Utilizes accurate captions and bounding boxes from the original dataset.
- **Image-based Prompting:** Leverages images to generate captions and conversations.
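For the caption-based strategy, a record can be assembled directly from a COCO caption and bounding box. The sketch below is hypothetical (the actual prompt templates and field names used for this dataset are not published here); it only illustrates packing a caption and box into the LLaVA-style conversation format:

```python
# Hypothetical sketch of caption-based prompt construction; the real
# pipeline's Vietnamese prompt templates may differ.
def make_caption_sample(image_id, caption, bbox):
    """Turn a COCO caption + bounding box into a LLaVA-style record."""
    x, y, w, h = bbox
    # Vietnamese instruction: "Describe the image region inside the box [...]"
    question = ("<image>\nMô tả vùng ảnh trong khung "
                f"[{x}, {y}, {x + w}, {y + h}].")
    return {
        "id": image_id,
        "conversations": [
            {"from": "human", "value": question},
            {"from": "gpt", "value": caption},
        ],
    }

sample = make_caption_sample("coco_123", "Một con mèo nằm trên ghế.", [10, 20, 50, 40])
```

Because the caption and box come straight from COCO annotations, the answer side of each pair is grounded rather than model-generated, unlike the image-based strategy.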
 
## Evaluation

- Coming soon 🫡
## Bias, Risks, and Limitations

The dataset may contain biases originating from its sources. Users should remain aware of these potential biases when utilizing the dataset.