--- license: apache-2.0 language: - en pipeline_tag: image-text-to-text tags: - llava - vision --- ## llava: computer vision GGUF - Clip Handler > mmproj-f16.gguf ([624MB](https://huggingface.co./calcuis/llava-gguf/blob/main/mmproj-f16.gguf)) - Vision Model > llava-***.gguf (opt anyone you like) - provide a link of picture as prompt - it will probably give hints to you for complying your text-to-image prompt (reverse engineering) 🎃 ## how to use - compatible with [gguf-connector](https://pypi.org/project/gguf-connector/) (ggc v) 👻 - see initial test examples (with just Q2_K - [2.53GB](https://huggingface.co./calcuis/llava-gguf/blob/main/llava-q2_k.gguf)) below ### prompt test#1 ![screenshot](https://raw.githubusercontent.com/calcuis/comfy/master/sd3.5g.png) prompt picture URL: https://raw.githubusercontent.com/calcuis/comfy/master/sd3.5g.png response: "The image features a computer screen displaying a cartoon of a woman with a fox head, sitting at a dining table. The table is set with a cake, a knife, and a cup, giving the impression that the woman is about to enjoy a meal. The cartoon is displayed in a graphic style, making it visually appealing and engaging." ### prompt test#2 prompt picture URL: https://raw.githubusercontent.com/calcuis/suko-puzzle/master/demo.png response: "The image shows a close-up of a cell phone screen, displaying a game board with a puzzle. The game board is filled with numbers, and there are four different colored circles surrounding the puzzle. The game appears to be a popular one, as the image is taken from the perspective of someone looking at the screen." ### prompt test#3 prompt picture URL: https://raw.githubusercontent.com/calcuis/gguf-connector/master/gguf.gif response: "The image features a green cartoon character with a mouth wide open, appearing to be yawning. The character has a toothy smile and is wearing a pair of yellow gloves. The overall scene is a colorful and fun representation of the green character." feedback from connector: "Awesome!"