| Branch | Bits per weight | Head bits | Measurement length | Length | Calibration dataset | Size | ExLlama version | Max context length |
| ------ | --------- | --------------- | ------------------------ | ----------- | ---------------- | ---- | ------- | ------------------ |
| [main](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2/tree/main) | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train) (wikitext-2-raw-v1) | 33GB | [V2](https://github.com/turboderp/exllamav2) | 4096 |

## Description

This repository contains EXL2 model files for [WizardLM's WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0).

EXL2 is a new format used by ExLlamaV2 (https://github.com/turboderp/exllamav2). EXL2 is based on the same optimization method as GPTQ. The format allows for mixing quantization levels within a model to achieve any average bitrate between 2 and 8 bits per weight.
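
The quantized files on the `main` branch can also be pulled programmatically. The following is a minimal sketch using `huggingface_hub`; the repo id and branch come from the table above, and everything else is illustrative:

```python
# Minimal download sketch (assumes `pip install huggingface_hub`).
# Repo id and branch are taken from the table above; paths are left to library defaults.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2",
    revision="main",  # the 4.0 bpw, 6-bit head variant listed above
)
print(f"EXL2 model files downloaded to: {local_dir}")
```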

## Prompt template (official) – Vicuna

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {prompt} ASSISTANT:
```
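
As a rough illustration of how this template might be used for inference, here is a minimal sketch built around ExLlamaV2's Python API. The class and method names (`ExLlamaV2Config`, `ExLlamaV2BaseGenerator`, `generate_simple`, ...) are assumed from the exllamav2 repository's example scripts and may differ between versions; the model path is a placeholder:

```python
# Minimal inference sketch modeled on exllamav2's example scripts; API names are
# assumed from https://github.com/turboderp/exllamav2 and may change between releases.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/path/to/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2"  # placeholder path
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)  # spread the ~33GB of weights across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7

# The official Vicuna-style template from above, with {prompt} filled in.
template = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: {prompt} ASSISTANT:"
)
prompt = template.format(prompt="Summarize what the EXL2 format does.")

print(generator.generate_simple(prompt, settings, num_tokens=256))
```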

## Prompt template (suggested)

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
...
ASSISTANT:
```

## Quantization process

| Original Model | → | Float16 Model | → | Safetensor Model | → | EXL2 Model |
| -------------- | --- | ------------- | --- | ---------------- | --- | ---------- |
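
A hedged sketch of what this pipeline can look like in practice: `transformers` produces the float16 safetensors checkpoint, and exllamav2's `convert.py` performs the EXL2 quantization. The script path, directories, and flag names (`-b`, `-hb`, `-l`, `-ml`, `-c`) are assumptions based on the exllamav2 repository, not a record of the exact commands used for this repo:

```python
# Illustrative pipeline: original checkpoint -> float16 -> safetensors -> EXL2.
# Requires substantial RAM/disk for a 70B model; paths and flags are illustrative
# and assumed from https://github.com/turboderp/exllamav2.
import subprocess
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "WizardLM/WizardLM-70B-V1.0"
fp16_dir = "WizardLM-70B-V1.0-HF"  # float16 model re-saved as safetensors

# Original Model -> Float16 Model -> Safetensor Model
model = AutoModelForCausalLM.from_pretrained(src, torch_dtype=torch.float16)
model.save_pretrained(fp16_dir, safe_serialization=True)
AutoTokenizer.from_pretrained(src).save_pretrained(fp16_dir)

# Safetensor Model -> EXL2 Model, matching the table above: 4.0 bits per weight,
# 6-bit head, 2048 measurement/quantization length, wikitext-2-raw-v1 calibration.
subprocess.run([
    "python", "exllamav2/convert.py",
    "-i", fp16_dir,                                # input (fp16 safetensors)
    "-o", "exl2_work",                             # working directory
    "-cf", "WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2",  # compiled output directory
    "-c", "0000.parquet",                          # calibration parquet
    "-b", "4.0", "-hb", "6",
    "-l", "2048", "-ml", "2048",
], check=True)
```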