Thireus committed
Commit 9dcf1ce · 1 Parent(s): 9d90c75

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -19,20 +19,20 @@ quantized_by: Thireus
 | ------ | --------- | --------------- | ------------------------ | ----------- | ---------------- | ---- | ------- | ------------------ |
 | [main](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2/tree/main) | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train) (wikitext-2-raw-v1) | 33GB | [V2](https://github.com/turboderp/exllamav2) | 4096 |
 
-## Description
+## Description:
 
 This repository contains EXL2 model files for [WizardLM's WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0).
 
 EXL2 is a new format used by ExLlamaV2 – https://github.com/turboderp/exllamav2. EXL2 is based on the same optimization method as GPTQ. The format allows for mixing quantization
 levels within a model to achieve any average bitrate between 2 and 8 bits per weight.
 
-## Prompt template (official) – Vicuna
+## Prompt template (official) – Vicuna:
 
 ```
 A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {prompt} ASSISTANT:
 ```
 
-## Prompt template (suggested)
+## Prompt template (suggested):
 
 ```
 A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
@@ -43,7 +43,7 @@ ASSISTANT:
 
 ```
 
-## Quantization process
+## Quantization process:
 
 | Original Model | → | Float16 Model | → | Safetensor Model | → | EXL2 Model |
 | -------------- | --- | ------------- | --- | ---------------- | --- | ---------- |
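
For context on the README content touched by this commit, the sketch below shows one way the 4.0 bpw EXL2 quant could be loaded and prompted with the official Vicuna template. It is a minimal sketch based on the example code in the exllamav2 repository; the class names (ExLlamaV2Config, ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Tokenizer, ExLlamaV2BaseGenerator, ExLlamaV2Sampler) may differ across versions, and the local model path, sampler settings, and user question are assumptions for illustration.

```python
# Minimal sketch: load the EXL2 quant with ExLlamaV2 and run one Vicuna-style prompt.
# Assumes the model files have been downloaded to a local directory (path is illustrative).
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2"  # assumed local path
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # lazy cache so weights can be auto-split across GPUs
model.load_autosplit(cache)                # a ~33GB 4.0 bpw 70B quant typically needs multiple GPUs

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

# Illustrative sampler settings.
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7
settings.top_p = 0.9

# Official (Vicuna) prompt template from the README, with an example question substituted.
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions. "
    "USER: What is the EXL2 format? ASSISTANT:"
)

output = generator.generate_simple(prompt, settings, 200)  # generate up to 200 new tokens
print(output)
```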