Want to support me and help pay my cloud computing bill? I also now have a Patreon!

## EXPERIMENTAL

Please note this is an experimental GPTQ model. Support for it is currently quite limited.

It is also expected to be **VERY SLOW**. This is unavoidable at the moment, but is being looked at.

To use it you will require:

1. AutoGPTQ, from the latest `main` branch and compiled with `pip install .`
2. `pip install einops`

You can then use it immediately from Python code - see example code below - or from text-generation-webui.
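
As a rough sketch of that Python usage (this assumes AutoGPTQ's `AutoGPTQForCausalLM.from_quantized()` loader; the exact arguments this repo expects, such as a quantized model basename, may differ):

```
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_name_or_path = "TheBloke/falcon-40B-instruct-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

# trust_remote_code is needed because Falcon models use custom modelling code
model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    device="cuda:0",
    use_safetensors=True,
    trust_remote_code=True,
)

prompt = "Write a story about llamas"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda:0")
output = model.generate(input_ids=input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0]))
```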

## AutoGPTQ

To install AutoGPTQ please follow these instructions:

```
git clone https://github.com/PanQiWei/AutoGPTQ
cd AutoGPTQ
pip install .
```

These steps will require that you have the [Nvidia CUDA toolkit](https://developer.nvidia.com/cuda-12-0-1-download-archive) installed.
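
Once installed, a quick way to sanity-check the build is to confirm the `auto_gptq` package imports cleanly:

```
# Sanity check: confirm the compiled AutoGPTQ package can be imported
import auto_gptq
print("AutoGPTQ imported successfully")
```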

## text-generation-webui

There is also provisional AutoGPTQ support in text-generation-webui.

This requires text-generation-webui as of commit 204731952ae59d79ea3805a425c73dd171d943c3, so please first update text-generation-webui to the latest version.
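
For example, assuming you installed text-generation-webui by cloning its Git repository, updating could look like:

```
cd text-generation-webui
git pull
```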

## How to download and use this model in text-generation-webui

1. Launch text-generation-webui with the following command-line arguments: `--autogptq --trust-remote-code` (an example launch command is sketched after this list)
2. Click the **Model tab**.
3. Under **Download custom model or LoRA**, enter `TheBloke/falcon-40B-instruct-GPTQ`.
4. Click **Download**.
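
As a sketch of step 1, assuming you launch text-generation-webui via its standard `server.py` entry point:

```
python server.py --autogptq --trust-remote-code
```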