TheBloke committed
Commit 6b2ed28
1 Parent(s): 7a1cec0

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -32,7 +32,7 @@ pipeline_tag: text-generation

  ## Description

- This repo contains GPTQ model files for [Stability AI's FreeWilly 2](https://huggingface.co/stabilityai/FreeWilly2).
+ This repo contains GPTQ model files for [Stability AI's StableBeluga 2](https://huggingface.co/stabilityai/StableBeluga2).

  Multiple GPTQ parameter permutations are provided; see Provided Files below for details of the options provided, their parameters, and the software used to create them.

@@ -76,10 +76,10 @@ Each separate quant is in a different branch. See below for instructions on fet

  ## How to download from branches

- - In text-generation-webui, you can add `:branch` to the end of the download name, eg `TheBloke/FreeWilly2-GPTQ:gptq-4bit-32g-actorder_True`
+ - In text-generation-webui, you can add `:branch` to the end of the download name, eg `TheBloke/StableBeluga2-GPTQ:gptq-4bit-32g-actorder_True`
  - With Git, you can clone a branch with:
  ```
- git clone --branch gptq-4bit-32g-actorder_True https://huggingface.co/TheBloke/FreeWilly2-GPTQ`
+ git clone --branch gptq-4bit-32g-actorder_True https://huggingface.co/TheBloke/StableBeluga2-GPTQ
  ```
  - In Python Transformers code, the branch is the `revision` parameter; see below.

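As the hunk above notes, the branch maps to the `revision` parameter in Python. A minimal sketch of fetching one branch with `huggingface_hub` (illustrative only, not part of the commit; the repo and branch names are taken from the README text above, everything else is assumed):

```python
# Sketch: download a single quantisation branch of the repo locally.
# Assumes huggingface_hub is installed; repo/branch names come from the README above.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="TheBloke/StableBeluga2-GPTQ",
    revision="gptq-4bit-32g-actorder_True",  # branch holding the desired quant
)
print(local_dir)  # path to the downloaded snapshot
```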
@@ -90,13 +90,13 @@ Please make sure you're using the latest version of [text-generation-webui](http
  It is strongly recommended to use the text-generation-webui one-click-installers unless you know how to make a manual install.

  1. Click the **Model tab**.
- 2. Under **Download custom model or LoRA**, enter `TheBloke/FreeWilly2-GPTQ`.
- - To download from a specific branch, enter for example `TheBloke/FreeWilly2-GPTQ:gptq-4bit-32g-actorder_True`
+ 2. Under **Download custom model or LoRA**, enter `TheBloke/StableBeluga2-GPTQ`.
+ - To download from a specific branch, enter for example `TheBloke/StableBeluga2-GPTQ:gptq-4bit-32g-actorder_True`
  - see Provided Files above for the list of branches for each option.
  3. Click **Download**.
  4. The model will start downloading. Once it's finished it will say "Done"
  5. In the top left, click the refresh icon next to **Model**.
- 6. In the **Model** dropdown, choose the model you just downloaded: `FreeWilly2-GPTQ`
+ 6. In the **Model** dropdown, choose the model you just downloaded: `StableBeluga2-GPTQ`
  7. The model will automatically load, and is now ready for use!
  8. If you want any custom settings, set them and then click **Save settings for this model** followed by **Reload the Model** in the top right.
  * Note that you do not need to set GPTQ parameters any more. These are set automatically from the file `quantize_config.json`.
@@ -104,7 +104,7 @@ It is strongly recommended to use the text-generation-webui one-click-installers

  ## How to use this GPTQ model from Python code

- First make sure you have [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ) installed:
+ First make sure you have [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ) 0.3.2 or later installed:

  `GITHUB_ACTIONS=true pip install auto-gptq`

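Since the updated text asks for AutoGPTQ 0.3.2 or later, a quick version check from Python can be sketched as follows (an assumed convenience, not part of the commit):

```python
# Assumed convenience check, not from the commit: verify the installed
# auto-gptq distribution meets the "0.3.2 or later" requirement above.
from importlib.metadata import version

print(version("auto-gptq"))  # should print 0.3.2 or newer
```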
@@ -114,7 +114,7 @@ Then try the following example code:
  from transformers import AutoTokenizer, pipeline, logging
  from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

- model_name_or_path = "TheBloke/FreeWilly2-GPTQ"
+ model_name_or_path = "TheBloke/StableBeluga2-GPTQ"
  model_basename = "gptq_model-4bit--1g"

  use_triton = False
@@ -123,7 +123,7 @@ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

  model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
          model_basename=model_basename,
-         inject_fused_attention=False, # Required for TheBloke/FreeWilly2-GPTQ model at this time.
+         inject_fused_attention=False, # Required for Llama 2 70B models at this time.
          use_safetensors=True,
          trust_remote_code=False,
          device="cuda:0",
 
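Read together, the Python hunks above imply an example along the following lines. This is a sketch assembled from the visible fragments; the prompt, the generation call, and the trailing `from_quantized` arguments are assumptions, not text from the commit:

```python
# Sketch assembled from the diff hunks above. Lines not visible in the diff
# (prompt, generation call, use_triton/quantize_config arguments) are assumptions.
from transformers import AutoTokenizer, pipeline, logging  # imported as in the README example
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_name_or_path = "TheBloke/StableBeluga2-GPTQ"
model_basename = "gptq_model-4bit--1g"

use_triton = False

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    model_basename=model_basename,
    inject_fused_attention=False,  # Required for Llama 2 70B models at this time.
    use_safetensors=True,
    trust_remote_code=False,
    device="cuda:0",
    use_triton=use_triton,   # assumed: passes the flag defined above
    quantize_config=None,    # assumed: settings are read from quantize_config.json
)

prompt = "Tell me about AI"  # illustrative prompt, not from the commit
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda:0")
output = model.generate(
    inputs=input_ids,
    do_sample=True,        # assumed generation settings
    temperature=0.7,
    max_new_tokens=512,
)
print(tokenizer.decode(output[0]))
```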