katuni4ka committed
Commit e7da494
1 Parent(s): 40760e5

Update README.md

Files changed (1): README.md (+22 -12)
---
license: apache-2.0
language:
- en
---

# Mixtral-8x7b-Instruct-v0.1-int4-ov

* Model creator: [Mistral AI](https://huggingface.co/mistralai)
* Original model: [Mixtral 8X7B Instruct v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)
## Description

This is the [Mixtral-8x7b-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to INT4 by [NNCF](https://github.com/openvinotoolkit/nncf).

## Quantization Parameters

Weight compression was performed using `nncf.compress_weights` with the following parameters:

* mode: **INT4_SYM**
* group_size: **128**
* ratio: **0.8**

For more information on quantization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html).
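
For illustration, below is a minimal sketch of how a compression with these parameters can be invoked through the NNCF API; the IR paths are placeholders, not the exact pipeline used to produce this model:

```
import nncf
import openvino as ov

core = ov.Core()
# Placeholder path: any full-precision OpenVINO IR of the model would do.
model = core.read_model("mixtral-8x7b-instruct-v0.1/openvino_model.xml")

compressed_model = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_SYM,  # symmetric 4-bit integer weights
    group_size=128,                          # one quantization scale per group of 128 weights
    ratio=0.8,                               # ~80% of weights in INT4, the remainder in INT8
)
ov.save_model(compressed_model, "mixtral-8x7b-instruct-v0.1-int4-ov/openvino_model.xml")
```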

## Compatibility

The provided OpenVINO™ IR model is compatible with:

* OpenVINO version 2024.0.0 and higher
* Optimum Intel 1.16.0 and higher
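
One way to check the installed versions against these requirements is to query the package metadata (a small sketch; it assumes the PyPI distribution names `openvino` and `optimum-intel`):

```
# Print installed versions to verify they meet the requirements above.
from importlib.metadata import version

print(version("openvino"))       # expected: 2024.0.0 or higher
print(version("optimum-intel"))  # expected: 1.16.0 or higher
```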

## Running Model Inference

1. Install packages required for using [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:

```
pip install optimum[openvino]
```

2. Run model inference:

```
from transformers import AutoTokenizer
from optimum.intel.openvino import OVModelForCausalLM

# Model ID assumed from this repository's name.
model_id = "OpenVINO/mixtral-8x7b-instruct-v0.1-int4-ov"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(model_id)

# Example prompt; the tokenizer output is unpacked into generate().
inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
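
Since this is an instruction-tuned model, prompts can also be built with the tokenizer's chat template (a sketch assuming the chat template ships with this tokenizer, as it does for the original Mixtral Instruct model):

```
# Continuing from the snippet above: build an instruction-formatted prompt
# with the tokenizer's chat template instead of a raw string.
messages = [{"role": "user", "content": "What is OpenVINO?"}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```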

For more examples and possible optimizations, refer to the [OpenVINO Large Language Model Inference Guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html).

## Limitations

Check the original model card for [limitations](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1#limitations).

## Legal information

The original model is distributed under the [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) license. More details can be found in the [original model card](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).