 
---
library_name: adapter-transformers
pipeline_tag: text-classification
---

<p align="center">
<img src="https://s11.ax1x.com/2023/12/28/piqvDMV.png" width="250" style="margin-bottom: 0.2;"/>
</p>
<h2 align="center"><a href="https://arxiv.org/abs/2401.15947">MoE-LLaVA-Qwen1.5-1.8B×4-Top2: When Vision Meets a Small-Scale Language Model and a Vietnamese Synthetic Dataset</a></h2>
# Introducing MoE-LLaVA-Qwen1.5-1.8B×4-Top2 for Vietnamese

We are excited to present MoE-LLaVA-Qwen1.5-1.8B×4-Top2, tailored for the Vietnamese language. This model is part of our ongoing effort to develop Vision Language Models (VLMs) for Vietnamese, a space that is still limited and dominated by larger models (~7B parameters). Our model activates only about 2.2B parameters per forward pass, significantly reducing the memory footprint, and it can be quantized for local execution.
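The ~2.2B active parameters come from Top-2 routing: for each token, a gating network scores the four experts and only the two highest-scoring ones are actually run. A minimal sketch of this routing idea (illustrative only, not the model's actual implementation):

```python
import numpy as np

def top2_moe(x, expert_weights, gate_weights):
    """Toy Top-2 mixture-of-experts layer.

    Each expert is a single linear map here; only the two experts with
    the highest gate scores are evaluated for a given token, so roughly
    half of the expert parameters are active per forward pass.
    """
    logits = x @ gate_weights                # gate score per expert, shape (num_experts,)
    top2 = np.argsort(logits)[-2:]           # indices of the 2 best-scoring experts
    probs = np.exp(logits[top2] - logits[top2].max())
    probs /= probs.sum()                     # softmax over the 2 selected experts
    # Only the selected experts are computed; the other 2 are skipped entirely.
    return sum(p * (x @ expert_weights[i]) for p, i in zip(probs, top2))

rng = np.random.default_rng(0)
d, num_experts = 8, 4
x = rng.normal(size=d)
experts = [rng.normal(size=(d, d)) for _ in range(num_experts)]
gate = rng.normal(size=(d, num_experts))
y = top2_moe(x, experts, gate)
```

Scaling the same idea up, 4 experts with Top-2 routing means each token pays the compute and memory cost of only 2 experts plus the shared (non-expert) layers.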
 
For the COCO dataset, we utilized LLaVA-style prompts to generate data:

- **Caption-based Prompting:** Utilizes accurate captions and bounding boxes from the original dataset.
- **Image-based Prompting:** Leverages images to generate captions and conversations.
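For the caption-based strategy, a record can be assembled directly from a COCO caption and bounding box. The sketch below is hypothetical (the actual prompt templates and field names used for this dataset are not published here); it only illustrates packing a caption and box into the LLaVA-style conversation format:

```python
# Hypothetical sketch of caption-based prompt construction; the real
# pipeline's Vietnamese prompt templates may differ.
def make_caption_sample(image_id, caption, bbox):
    """Turn a COCO caption + bounding box into a LLaVA-style record."""
    x, y, w, h = bbox
    # Vietnamese instruction: "Describe the image region inside the box [...]"
    question = ("<image>\nMô tả vùng ảnh trong khung "
                f"[{x}, {y}, {x + w}, {y + h}].")
    return {
        "id": image_id,
        "conversations": [
            {"from": "human", "value": question},
            {"from": "gpt", "value": caption},
        ],
    }

sample = make_caption_sample("coco_123", "Một con mèo nằm trên ghế.", [10, 20, 50, 40])
```

Because the caption and box come straight from COCO annotations, the answer side of each pair is grounded rather than model-generated, unlike the image-based strategy.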
 
## Evaluation

- Coming soon 🫡
## Bias, Risks, and Limitations

The dataset may contain biases originating from its sources. Users should remain aware of these potential biases when utilizing the dataset.