OPEA /

GGUF · Inference Endpoints · conversational

cicdatopea committed 0c4f21a (verified) · 1 parent: e6d74f0

Update README.md

Files changed (1): README.md (+3 −4)
README.md CHANGED
@@ -7,7 +7,7 @@ base_model:
 
 ## Model Details
 
-This awq model is an int4 model with group_size 32 and symmetric quantization of [SmallThinker-3B-Preview](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview) generated by [intel/auto-round](https://github.com/intel/auto-round).
+This awq model is an int4 model with group_size 32 and symmetric quantization of [PowerInfer/SmallThinker-3B-Preview](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview) generated by [intel/auto-round](https://github.com/intel/auto-round).
 
 ## How To Use
 ### Requirements
@@ -111,11 +111,10 @@ text="How many r in strawberry."
 
 ### Generate the model
 
-Here is the sample command to generate the model. For symmetric quantization, we found overflow/NAN will occur for some backends, so better fallback some layers. auto_round requires version >0.4.1
-
+Here is the sample command to generate the model.
 ```bash
 auto-round \
---model QSmallThinker-3B-Preview \
+--model PowerInfer/SmallThinker-3B-Preview \
 --device 0 \
 --group_size 32 \
 --bits 4 \
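The README describes the model as int4 with group_size 32 and symmetric quantization. As context for what that recipe means numerically, here is a minimal toy sketch of a symmetric 4-bit group-wise quantize/dequantize round-trip — an illustration of the scheme only, not auto-round's actual implementation (function name and pure-Python style are our own):

```python
def quant_dequant_sym_int4(weights, group_size=32):
    """Toy round-trip: symmetric int4 quantization over groups of `group_size`.

    Each group of 32 values shares one scale; values are mapped to integers
    in the int4 range [-8, 7] and then mapped back (dequantized).
    """
    qmax = 7  # symmetric scheme: scale maps the group's max |w| to 7
    out = []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        # one shared scale per group; guard against an all-zero group
        scale = max(abs(w) for w in group) / qmax or 1.0
        for w in group:
            q = max(-8, min(7, round(w / scale)))  # clamp to int4 range
            out.append(q * scale)                  # dequantize
    return out
```

The worst-case reconstruction error per value is half the group's scale, which is why a smaller group_size (32 here, vs. the more common 128) trades extra stored scales for tighter accuracy.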