	---
	license: apache-2.0
	base_model:
	- Qwen/Qwen2-VL-2B-Instruct
	pipeline_tag: image-text-to-text
	language:
	- en
	tags:
	- multimodal
	- conversational
	---

	# Qwen2-VL-2B-Instruct-GGUF (f16)
	This is an F16 GGUF version of [Qwen2-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct) for use with llama.cpp, which lets you run Qwen2-VL locally (for example, on a Mac).

	# How to Use
	1. Build the `llama-qwen2vl-cli` executable.
	2. Download the model files and run: `./llama-qwen2vl-cli -m Qwen2-VL-2B-Instruct-F16.gguf --mmproj qwen2-vl-2b-instruct-vision.gguf -p "Describe this image." --image crocodiles.png`
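
The two steps above can be wrapped in a small script that checks the required files exist before invoking the CLI. This is only a sketch: the `build/bin` location of the executable, the default file locations, and the helper name `check_inputs` are assumptions for illustration, so adjust them to your setup.

```shell
#!/bin/sh
# Assumed default locations (not from the model card); override via environment.
CLI=${CLI:-./build/bin/llama-qwen2vl-cli}
MODEL=${MODEL:-./Qwen2-VL-2B-Instruct-F16.gguf}
MMPROJ=${MMPROJ:-./qwen2-vl-2b-instruct-vision.gguf}
IMAGE=${IMAGE:-./crocodiles.png}

# check_inputs FILE...
# Hypothetical helper: reports every missing file on stderr and
# returns non-zero if at least one is missing.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return $missing
}

# Run the model only when the executable, both GGUF files, and the image exist.
if check_inputs "$CLI" "$MODEL" "$MMPROJ" "$IMAGE"; then
    "$CLI" -m "$MODEL" --mmproj "$MMPROJ" -p "Describe this image." --image "$IMAGE"
fi
```

For example, `IMAGE=path/to/image.png sh run.sh` would describe a different image, assuming the script is saved as `run.sh`.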

	## Details on Usage:
	1. Download the model files from this repository (sidrajaram/Qwen2-VL-2B-Instruct-GGUF).

	2. Make sure you have cloned [llama.cpp](https://github.com/ggerganov/llama.cpp) and built the `llama-qwen2vl-cli` executable.
	```
	git clone https://github.com/ggerganov/llama.cpp.git
	```
	For example, to build with CMake (see the detailed llama.cpp build instructions: https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md):
	```
	cmake -B build
	cmake --build build --config Release
	```
	3. Run the model:
	```
	./path/to/llama-qwen2vl-cli -m path/to/Qwen2-VL-2B-Instruct-F16.gguf --mmproj path/to/qwen2-vl-2b-instruct-vision.gguf -p "Describe this image." --image path/to/image.png
	```
	Note: According to llama.cpp contributors, "it's recommended to resize the image to a resolution below 640x640, so it won't take forever to run on CPU backend."
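
To make the resizing advice concrete, here is a minimal sketch of the arithmetic for shrinking an image so that neither side exceeds a limit while keeping its aspect ratio. The helper name `fit_within` is hypothetical, and treating 640 as the cap is an assumption based on the note above, not a documented llama.cpp requirement.

```shell
#!/bin/sh
# fit_within WIDTH HEIGHT MAX
# Prints "NEW_WIDTH NEW_HEIGHT": the same aspect ratio (integer-rounded down),
# scaled so the longer side equals MAX. Images already within MAX are unchanged.
fit_within() {
    w=$1
    h=$2
    max=$3
    if [ "$w" -le "$max" ] && [ "$h" -le "$max" ]; then
        echo "$w $h"
    elif [ "$w" -ge "$h" ]; then
        echo "$max $(( h * max / w ))"
    else
        echo "$(( w * max / h )) $max"
    fi
}

fit_within 1920 1080 640   # prints: 640 360
```

Most image tools can do this scaling themselves. For instance, if ImageMagick 7 is installed, `magick input.png -resize '640x640>' output.png` shrinks only images larger than the limit, and on macOS `sips -Z 640 input.png --out output.png` caps the longer side at 640 pixels.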


	Credit to the original model: https://huggingface.co/Qwen/Qwen2-VL-2B-Instruct