Update README.md
README.md CHANGED
@@ -41,21 +41,21 @@ GGML versions are not yet provided, as there is not yet support for SuperHOT in
 
 ## Repositories available
 
-* [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/WizardLM-13B-V1
-* [Unquantised SuperHOT fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/TheBloke/WizardLM-13B-V1
-* [Unquantised base fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/ehartford/WizardLM-13B-V1
+* [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/WizardLM-13B-V1-0-Uncensored-SuperHOT-8K-GPTQ)
+* [Unquantised SuperHOT fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/TheBloke/WizardLM-13B-V1-0-Uncensored-SuperHOT-8K-fp16)
+* [Unquantised base fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/ehartford/WizardLM-13B-V1-0-Uncensored)
 
 ## How to easily download and use this model in text-generation-webui with ExLlama
 
 Please make sure you're using the latest version of text-generation-webui
 
 1. Click the **Model tab**.
-2. Under **Download custom model or LoRA**, enter `TheBloke/WizardLM-13B-V1
+2. Under **Download custom model or LoRA**, enter `TheBloke/WizardLM-13B-V1-0-Uncensored-SuperHOT-8K-GPTQ`.
 3. Click **Download**.
 4. The model will start downloading. Once it's finished it will say "Done"
 5. Untick **Autoload the model**
 6. In the top left, click the refresh icon next to **Model**.
-7. In the **Model** dropdown, choose the model you just downloaded: `WizardLM-13B-V1
+7. In the **Model** dropdown, choose the model you just downloaded: `WizardLM-13B-V1-0-Uncensored-SuperHOT-8K-GPTQ`
 8. To use the increased context, set the **Loader** to **ExLlama**, set **max_seq_len** to 8192 or 4096, and set **compress_pos_emb** to **4** for 8192 context, or to **2** for 4096 context.
 9. Now click **Save Settings** followed by **Reload**
 10. The model will automatically load, and is now ready for use!
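The two values in step 8 are not independent: the pairs given (8192 → 4, 4096 → 2) imply that **compress_pos_emb** is simply **max_seq_len** divided by the base model's native 2048-token context window. A minimal sketch of that relationship; the helper function is ours for illustration and is not part of text-generation-webui:

```python
BASE_CTX = 2048  # assumed native context window of the Llama base model

def compress_pos_emb_for(max_seq_len: int, base_ctx: int = BASE_CTX) -> int:
    """Derive the compress_pos_emb value for a target context length,
    following the ratio implied by step 8 (8192 -> 4, 4096 -> 2)."""
    if max_seq_len % base_ctx != 0:
        raise ValueError("max_seq_len should be a multiple of the base context")
    return max_seq_len // base_ctx

print(compress_pos_emb_for(8192))  # -> 4
print(compress_pos_emb_for(4096))  # -> 2
```

If the two settings are mismatched (e.g. max_seq_len 8192 with compress_pos_emb 2), output quality typically degrades beyond the compressed range, so keeping them in this ratio is the safe choice.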