---
license: apache-2.0
language:
- en
pipeline_tag: image-text-to-text
tags:
- llava
- vision
---
## llava: computer vision GGUF
- Clip Handler > mmproj-f16.gguf ([624MB](https://huggingface.co./calcuis/llava-gguf/blob/main/mmproj-f16.gguf))
- Vision Model > llava-***.gguf (opt anyone you like)
- provide a link of picture as prompt
- it will probably give hints to you for complying your text-to-image prompt (reverse engineering) 🎃
## how to use
- compatible with [gguf-connector](https://pypi.org/project/gguf-connector/) (ggc v) 👻
- see initial test examples (with just Q2_K - [2.53GB](https://huggingface.co./calcuis/llava-gguf/blob/main/llava-q2_k.gguf)) below
### prompt test#1
![screenshot](https://raw.githubusercontent.com/calcuis/comfy/master/sd3.5g.png)
prompt picture URL: https://raw.githubusercontent.com/calcuis/comfy/master/sd3.5g.png
response: "The image features a computer screen displaying a cartoon of a woman with a fox head, sitting at a dining table. The table is set with a cake, a knife, and
a cup, giving the impression that the woman is about to enjoy a meal. The cartoon is displayed in a graphic style, making it visually appealing and
engaging."
### prompt test#2
prompt picture URL: https://raw.githubusercontent.com/calcuis/suko-puzzle/master/demo.png
response: "The image shows a close-up of a cell phone screen, displaying a game board with a puzzle. The game board is filled with numbers, and there are four
different colored circles surrounding the puzzle. The game appears to be a popular one, as the image is taken from the perspective of someone looking at the
screen."
### prompt test#3
prompt picture URL: https://raw.githubusercontent.com/calcuis/gguf-connector/master/gguf.gif
response: "The image features a green cartoon character with a mouth wide open, appearing to be yawning. The character has a toothy smile and is wearing a pair of
yellow gloves. The overall scene is a colorful and fun representation of the green character."
feedback from connector: "Awesome!"