OPEA /

GGUF · Inference Endpoints · conversational

cicdatopea committed 0c4f21a (verified) · 1 parent: e6d74f0

Update README.md

Files changed (1): README.md (+3 −4)
README.md CHANGED
@@ -7,7 +7,7 @@ base_model:
 
 ## Model Details
 
-This awq model is an int4 model with group_size 32 and symmetric quantization of [SmallThinker-3B-Preview](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview) generated by [intel/auto-round](https://github.com/intel/auto-round).
+This awq model is an int4 model with group_size 32 and symmetric quantization of [PowerInfer/SmallThinker-3B-Preview](https://huggingface.co/PowerInfer/SmallThinker-3B-Preview) generated by [intel/auto-round](https://github.com/intel/auto-round).
 
 ## How To Use
 ### Requirements
@@ -111,11 +111,10 @@ text="How many r in strawberry."
 
 ### Generate the model
 
-Here is the sample command to generate the model. For symmetric quantization, we found overflow/NAN will occur for some backends, so better fallback some layers. auto_round requires version >0.4.1
-
+Here is the sample command to generate the model.
 ```bash
 auto-round \
---model QSmallThinker-3B-Preview \
+--model PowerInfer/SmallThinker-3B-Preview \
 --device 0 \
 --group_size 32 \
 --bits 4 \
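The README describes the model as int4 with group_size 32 and symmetric quantization. As context for what that recipe means numerically, here is a minimal toy sketch of a symmetric 4-bit group-wise quantize/dequantize round-trip — an illustration of the scheme only, not auto-round's actual implementation (function name and pure-Python style are our own):

```python
def quant_dequant_sym_int4(weights, group_size=32):
    """Toy round-trip: symmetric int4 quantization over groups of `group_size`.

    Each group of 32 values shares one scale; values are mapped to integers
    in the int4 range [-8, 7] and then mapped back (dequantized).
    """
    qmax = 7  # symmetric scheme: scale maps the group's max |w| to 7
    out = []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        # one shared scale per group; guard against an all-zero group
        scale = max(abs(w) for w in group) / qmax or 1.0
        for w in group:
            q = max(-8, min(7, round(w / scale)))  # clamp to int4 range
            out.append(q * scale)                  # dequantize
    return out
```

The worst-case reconstruction error per value is half the group's scale, which is why a smaller group_size (32 here, vs. the more common 128) trades extra stored scales for tighter accuracy.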