yuwenz committed on
Commit
33749c1
·
1 Parent(s): e3069f5

upload int8 onnx model

Signed-off-by: yuwenzho <[email protected]>

Files changed (2)
  1. README.md +24 -1
  2. model.onnx +3 -0
README.md CHANGED
@@ -14,7 +14,9 @@ metrics:

  # INT8 MiniLM finetuned MRPC

- ### Post-training static quantization
+ ## Post-training static quantization
+
+ ### PyTorch

  This is an INT8 PyTorch model quantized with [huggingface/optimum-intel](https://github.com/huggingface/optimum-intel) through the usage of [Intel® Neural Compressor](https://github.com/intel/neural-compressor).
  The original fp32 model comes from the fine-tuned model [Intel/MiniLM-L12-H384-uncased-mrpc](https://huggingface.co/Intel/MiniLM-L12-H384-uncased-mrpc).
@@ -38,3 +40,24 @@ int8_model = IncQuantizedModelForSequenceClassification(
  'Intel/MiniLM-L12-H384-uncased-mrpc-int8-static',
  )
  ```
+
+ ### ONNX
+
+ This is an INT8 ONNX model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor).
+
+ The original fp32 model comes from the fine-tuned model [Intel/MiniLM-L12-H384-uncased-mrpc](https://huggingface.co/Intel/MiniLM-L12-H384-uncased-mrpc).
+
+ #### Test result
+
+ | |INT8|FP32|
+ |---|:---:|:---:|
+ | **Accuracy (eval-f1)** |0.9137|0.9097|
+ | **Model size (MB)** |120|128|
+
+
+ #### Load ONNX model:
+
+ ```python
+ from optimum.onnxruntime import ORTModelForSequenceClassification
+ model = ORTModelForSequenceClassification.from_pretrained('Intel/MiniLM-L12-H384-uncased-mrpc-int8-static')
+ ```
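
To put the "Test result" table added to the README in relative terms, here is a minimal Python sketch. It uses only the four numbers reported in that table; the variable names are illustrative and not part of the commit.

```python
# Figures copied from the "Test result" table in the README diff.
f1_int8, f1_fp32 = 0.9137, 0.9097
size_int8_mb, size_fp32_mb = 120, 128

# Absolute change in eval-f1 after quantization (positive means INT8 scored higher).
f1_delta = f1_int8 - f1_fp32

# Relative size reduction of the INT8 model, in percent.
size_reduction_pct = (size_fp32_mb - size_int8_mb) / size_fp32_mb * 100

print(f"eval-f1 change: {f1_delta:+.4f}")
print(f"size reduction: {size_reduction_pct:.2f}%")
```

On the reported figures this prints an eval-f1 change of `+0.0040` and a size reduction of `6.25%`.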
model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6d0e88478db4ecf4607bf9f2780a44758f292f18e409e6abcb2a150ffb97d482
+ size 125535210
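
The Git LFS pointer records the size of `model.onnx` in bytes. As a rough cross-check against the model size listed in the README, the sketch below converts that byte count to both decimal megabytes and binary mebibytes. Which unit the README's "MB" column uses is not stated in the commit, so treating it as either one is an assumption.

```python
# Size in bytes, copied from the Git LFS pointer for model.onnx.
size_bytes = 125535210

size_mb = size_bytes / 10**6    # decimal megabytes (assumed unit, not confirmed)
size_mib = size_bytes / 2**20   # binary mebibytes (assumed unit, not confirmed)

print(f"{size_mb:.1f} MB, {size_mib:.1f} MiB")
```

The binary figure rounds to 120, which agrees with the INT8 entry in the README table and suggests, without confirming, that the table reports sizes in MiB.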