File size: 4,978 Bytes
d2e787d |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 |
---
license: mit
language:
- multilingual
pipeline_tag: image-text-to-text
tags:
- nlp
- vision
- internvl
base_model:
- OpenGVLab/InternVL2-2B
base_model_relation: quantized
---
# InternVL2-2B-int4-ov
* Model creator: [OpenGVLab](https://huggingface.co./OpenGVLab)
* Original model: [InternVL2-2B](https://huggingface.co./OpenGVLab/InternVL2-2B)
## Description
This is [OpenGVLab/InternVL2-2B](https://huggingface.co./OpenGVLab/InternVL2-2B) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to INT4 using Activation Aware Quantization (AWQ) by [NNCF](https://github.com/openvinotoolkit/nncf).
## Quantization Parameters
Weight compression was performed using `nncf.compress_weights` with the following parameters:
* mode: **INT4_ASYM**
* ratio: **1.0**
* group_size: **128**
* awq: **True**
* dataset: **[contextual](https://huggingface.co./datasets/ucla-contextual/contextual_test)**
* num_samples: **32**
## Compatibility
The provided OpenVINO™ IR model is compatible with:
* OpenVINO version 2025.0.0 and higher
* Optimum Intel 1.21.0 and higher
## Running Model Inference with [Optimum Intel](https://huggingface.co./docs/optimum/intel/index)
1. Install packages required for using [Optimum Intel](https://huggingface.co./docs/optimum/intel/index) integration with the OpenVINO backend:
```
pip install --pre -U --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release openvino_tokenizers openvino
pip install git+https://github.com/huggingface/optimum-intel.git
```
2. Run model inference
```
from PIL import Image
import requests
from optimum.intel.openvino import OVModelForVisualCausalLM
from transformers import AutoTokenizer, TextStreamer
model_id = "OpenVINO/InternVL2-2B-int4-ov"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
ov_model = OVModelForVisualCausalLM.from_pretrained(model_id, trust_remote_code=True)
prompt = "What is unusual on this picture?"
url = "https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/d5fbbd1a-d484-415c-88cb-9986625b7b11"
image = Image.open(requests.get(url, stream=True).raw)
inputs = ov_model.preprocess_inputs(text=prompt, image=image, tokenizer=tokenizer, config=ov_model.config)
generation_args = {
"max_new_tokens": 100,
"streamer": TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
}
generate_ids = ov_model.generate(**inputs, **generation_args)
generate_ids = generate_ids[:, inputs['input_ids'].shape[1]:]
response = tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0]
```
## Running Model Inference with [OpenVINO GenAI](https://github.com/openvinotoolkit/openvino.genai)
1. Install packages required for using OpenVINO GenAI.
```
pip install --pre -U --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/pre-release openvino openvino-tokenizers openvino-genai
pip install huggingface_hub
```
2. Download model from HuggingFace Hub
```
import huggingface_hub as hf_hub
model_id = "OpenVINO/InternVL2-2B-int4-ov"
model_path = "InternVL2-2B-int4-ov"
hf_hub.snapshot_download(model_id, local_dir=model_path)
```
1. Run model inference:
```
import openvino_genai as ov_genai
import requests
from PIL import Image
from io import BytesIO
import numpy as np
import openvino as ov
device = "CPU"
pipe = ov_genai.VLMPipeline(model_path, device)
def load_image(image_file):
if isinstance(image_file, str) and (image_file.startswith("http") or image_file.startswith("https")):
response = requests.get(image_file)
image = Image.open(BytesIO(response.content)).convert("RGB")
else:
image = Image.open(image_file).convert("RGB")
image_data = np.array(image.getdata()).reshape(1, image.size[1], image.size[0], 3).astype(np.byte)
return ov.Tensor(image_data)
prompt = "What is unusual on this picture?"
url = "https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/d5fbbd1a-d484-415c-88cb-9986625b7b11"
image_tensor = load_image(url)
def streamer(subword: str) -> bool:
print(subword, end="", flush=True)
return False
pipe.start_chat()
output = pipe.generate(prompt, image=image_tensor, max_new_tokens=100, streamer=streamer)
pipe.start_chat()
```
More GenAI usage examples can be found in OpenVINO GenAI library [docs](https://github.com/openvinotoolkit/openvino.genai/blob/master/src/README.md) and [samples](https://github.com/openvinotoolkit/openvino.genai?tab=readme-ov-file#openvino-genai-samples)
## Limitations
Check the original [model card](https://huggingface.co./OpenGVLab/InternVL2-2B) for limitations.
## Legal information
The original model is distributed under [MIT](https://huggingface.co./datasets/choosealicense/licenses/blob/main/markdown/mit.md) license. More details can be found in [original model card](https://huggingface.co./OpenGVLab/InternVL2-2B).
|