TheBloke/vicuna-7B-v0-GPTQ


TheBloke committed on Aug 21, 2023

Commit a827c03 • 1 Parent(s): 63d1461

Update README.md

Files changed (1)

README.md +1 -4


@@ -15,6 +15,7 @@ inference: false
     </div>
 </div>
 <!-- header end -->
 # Vicuna 7B GPTQ 4-bit 128g
 This repository contains the [Vicuna 7B model](https://huggingface.co/lmsys/vicuna-7b-delta-v0) quantised using [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).
@@ -26,10 +27,6 @@ The original Vicuna 7B repository contains deltas rather than weights. Rather th
 Two model files are provided. You don't need both, choose the one you prefer.
 Details of the files provided:
-* `vicuna-7B-GPTQ-4bit-128g.pt`
-  * pt format file, created with the latest [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) code.
-  * Command to create:
-    * `python3 llama.py vicuna-7B c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save vicuna-7B-GPTQ-4bit-128g.pt`
 * `vicuna-7B-GPTQ-4bit-128g.safetensors`
   * newer `safetensors` format, with improved file security, created with the latest [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa) code.
   * Command to create:
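
As a rough illustration of what the `--wbits 4` and `--groupsize 128` flags in the removed command refer to, the sketch below quantises a flat list of weights in groups of 128, storing one scale and one minimum per group and a 4-bit code per weight. It uses plain round-to-nearest, which is an assumption made for brevity: it is not the GPTQ algorithm (GPTQ additionally compensates quantisation error across weights), and the function names are hypothetical rather than taken from GPTQ-for-LLaMa.

```python
def quantize_groupwise(weights, wbits=4, groupsize=128):
    """Round-to-nearest quantisation with one (scale, minimum) pair per group.

    Illustrative only: this is NOT GPTQ, just the storage layout idea
    behind "4-bit, group size 128".
    """
    levels = 2 ** wbits - 1  # 15 distinct steps above the minimum for 4-bit
    groups = []
    for start in range(0, len(weights), groupsize):
        group = weights[start:start + groupsize]
        lo, hi = min(group), max(group)
        # Guard against a constant group, where hi == lo would give scale 0.
        scale = (hi - lo) / levels if hi > lo else 1.0
        codes = [round((w - lo) / scale) for w in group]
        groups.append((scale, lo, codes))
    return groups


def dequantize_groupwise(groups):
    """Reconstruct approximate weights from (scale, minimum, codes) groups."""
    restored = []
    for scale, lo, codes in groups:
        restored.extend(lo + scale * c for c in codes)
    return restored


if __name__ == "__main__":
    weights = [i / 100 for i in range(-128, 128)]  # 256 toy weights
    groups = quantize_groupwise(weights)
    restored = dequantize_groupwise(groups)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"{len(groups)} groups, worst absolute error {worst:.4f}")
```

A smaller group size means more scale and minimum pairs to store, but each pair covers a narrower range of values, so the reconstruction error per weight tends to be lower.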
