DAMO-NLP-SG/VideoLLaMA3-7B

Visual Question Answering

videollama3_qwen2

text-generation

large-language-model

video-language-model

Add pipeline tag

#1

by nielsr HF staff - opened about 8 hours ago

base: refs/heads/main ← from: refs/pr/1

README.md +10 -3

@@ -15,13 +15,12 @@ language:
 - en
 metrics:
 - accuracy
-pipeline_tag: visual-question-answering
+pipeline_tag: any-to-any
 base_model:
 - Qwen/Qwen2.5-7B-Instruct
 - DAMO-NLP-SG/VideoLLaMA3-7B-Image
 ---
 <p align="center">
     <img src="https://cdn-uploads.huggingface.co/production/uploads/626938b16f8f86ad21deb989/tt5KYnAUmQlHtfB1-Zisl.png" width="150" style="margin-bottom: 0.2;"/>
 <p>
@@ -139,4 +138,12 @@ If you find VideoLLaMA useful for your research and applications, please cite us
   year = {2023},
   url = {https://arxiv.org/abs/2306.02858}
 }
-```
+```
+## 👍 Acknowledgement
+Our VideoLLaMA3 is built on top of [**SigLip**](https://huggingface.co/google/siglip-so400m-patch14-384) and [**Qwen2.5**](https://github.com/QwenLM/Qwen2.5). We also learned a lot from the implementation of [**LLaVA-OneVision**](https://github.com/LLaVA-VL/LLaVA-NeXT), [**InternVL2**](https://internvl.github.io/blog/2024-07-02-InternVL-2.0/), and [**Qwen2VL**](https://github.com/QwenLM/Qwen2-VL). Besides, our VideoLLaMA3 benefits from tons of open-source efforts. We sincerely appreciate these efforts and compile a list in [ACKNOWLEDGEMENT.md](https://github.com/DAMO-NLP-SG/VideoLLaMA3/blob/main/ACKNOWLEDGEMENT.md) to express our gratitude. If your work is used in VideoLLaMA3 but not mentioned in either this repo or the technical report, feel free to let us know :heart:.
+## 🔒 License
+This project is released under the Apache 2.0 license as found in the LICENSE file.
+The service is a research preview intended for **non-commercial use ONLY**, subject to the model Licenses of Qwen, Terms of Use of the data generated by OpenAI and Gemini, and Privacy Practices of ShareGPT. Please get in touch with us if you find any potential violations.