Thireus committed
Commit efd5fb2 · 1 Parent(s): 5144fcd

Update README.md

Files changed (1): README.md +4 -2
README.md CHANGED

@@ -18,10 +18,12 @@ quantized_by: Thireus
  | Branch | BITS (-b) | HEAD BITS (-hb) | MEASUREMENT LENGTH (-ml) | LENGTH (-l) | CAL DATASET (-c) | Size | ExLlama | Max Context Length |
  | ------ | --------- | --------------- | ------------------------ | ----------- | ---------------- | ---- | ------- | ------------------ |
  | [main](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2/tree/main) | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train) (wikitext-2-raw-v1) | 33GB | [v2](https://github.com/turboderp/exllamav2) | 4096 |
+ | _coming soon..._ | 5.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train) (wikitext-2-raw-v1) | 33GB | [v2](https://github.com/turboderp/exllamav2) | 4096 |
+ | _coming soon..._ | 6.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train) (wikitext-2-raw-v1) | 33GB | [v2](https://github.com/turboderp/exllamav2) | 4096 |
  
  ## Description:
  
- This repository contains EXL2 model files for [WizardLM's WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0).
+ _This repository contains EXL2 model files for [WizardLM's WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)._
  
  EXL2 is a new format used by ExLlamaV2 – https://github.com/turboderp/exllamav2. EXL2 is based on the same optimization method as GPTQ. The format allows for mixing quantization
  levels within a model to achieve any average bitrate between 2 and 8 bits per weight.

@@ -56,7 +58,7 @@ mkdir -p ~/EXL2/WizardLM-70B-V1.0-HF_4bit # Create the output directory
  python convert.py -i ~/float16_safetensored/WizardLM-70B-V1.0-HF -o ~/EXL2/WizardLM-70B-V1.0-HF_4bit -c ~/EXL2/0000.parquet -b 4.0 -hb 6
  ```
  
- (*) Use any one of the following scripts to convert your local pytorch_model bin files to safetensors:
+ \* Use any one of the following scripts to convert your local pytorch_model bin files to safetensors:
  
  - https://github.com/turboderp/exllamav2/blob/master/util/convert_safetensors.py (official ExLlamaV2)
  - https://huggingface.co/Panchovix/airoboros-l2-70b-gpt4-1.4.1-safetensors/blob/main/bin2safetensors/convert.py (recommended)
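The Size column in the table above follows almost directly from the average bits per weight that EXL2 targets (`-b`). A minimal sketch of that arithmetic, assuming a nominal 70B parameter count and ignoring per-tensor format overhead, so the figures are approximate:

```python
def exl2_size_gib(n_params: float, avg_bpw: float) -> float:
    """Approximate on-disk size in GiB of a model quantized at avg_bpw bits per weight."""
    return n_params * avg_bpw / 8 / 2**30  # bits -> bytes -> GiB

# Nominal 70B parameters at the bitrates in the table (4.0 live, 5.0/6.0 coming soon):
for bpw in (4.0, 5.0, 6.0):
    print(f"{bpw} bpw -> ~{exl2_size_gib(70e9, bpw):.1f} GiB")
```

At 4.0 bpw this comes out near the ~33GB the main branch lists; by the same arithmetic the 5.0 and 6.0 bpw branches would land closer to ~41 and ~49 GiB, so the 33GB in the placeholder rows is presumably provisional.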