| Branch | Bits per weight | Head bits | Measurement length | Length | Calibration dataset | Size | ExLlama version | Max context length |
| ------ | --------- | --------------- | ------------------------ | ----------- | ---------------- | ---- | ------- | ------------------ |
| [main](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2/tree/main) | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train) (wikitext-2-raw-v1) | 33GB | [V2](https://github.com/turboderp/exllamav2) | 4096 |

## Description

This repository contains EXL2 model files for [WizardLM's WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0).

EXL2 is a new format used by ExLlamaV2 (https://github.com/turboderp/exllamav2). EXL2 is based on the same optimization method as GPTQ. The format allows for mixing quantization levels within a model to achieve any average bitrate between 2 and 8 bits per weight.
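
The quantized files on the `main` branch can also be pulled programmatically. The following is a minimal sketch using `huggingface_hub`; the repo id and branch come from the table above, and everything else is illustrative:

```python
# Minimal download sketch (assumes `pip install huggingface_hub`).
# Repo id and branch are taken from the table above; paths are left to library defaults.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2",
    revision="main",  # the 4.0 bpw, 6-bit head variant listed above
)
print(f"EXL2 model files downloaded to: {local_dir}")
```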

## Prompt template (official) – Vicuna

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {prompt} ASSISTANT:
```
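
As a rough illustration of how this template might be used for inference, here is a minimal sketch built around ExLlamaV2's Python API. The class and method names (`ExLlamaV2Config`, `ExLlamaV2BaseGenerator`, `generate_simple`, ...) are assumed from the exllamav2 repository's example scripts and may differ between versions; the model path is a placeholder:

```python
# Minimal inference sketch modeled on exllamav2's example scripts; API names are
# assumed from https://github.com/turboderp/exllamav2 and may change between releases.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/path/to/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2"  # placeholder path
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)  # spread the ~33GB of weights across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7

# The official Vicuna-style template from above, with {prompt} filled in.
template = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: {prompt} ASSISTANT:"
)
prompt = template.format(prompt="Summarize what the EXL2 format does.")

print(generator.generate_simple(prompt, settings, num_tokens=256))
```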

## Prompt template (suggested)

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
...
ASSISTANT:
```

## Quantization process

| Original Model | → | Float16 Model | → | Safetensor Model | → | EXL2 Model |
| -------------- | --- | ------------- | --- | ---------------- | --- | ---------- |
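
A hedged sketch of what this pipeline can look like in practice: `transformers` produces the float16 safetensors checkpoint, and exllamav2's `convert.py` performs the EXL2 quantization. The script path, directories, and flag names (`-b`, `-hb`, `-l`, `-ml`, `-c`) are assumptions based on the exllamav2 repository, not a record of the exact commands used for this repo:

```python
# Illustrative pipeline: original checkpoint -> float16 -> safetensors -> EXL2.
# Requires substantial RAM/disk for a 70B model; paths and flags are illustrative
# and assumed from https://github.com/turboderp/exllamav2.
import subprocess
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "WizardLM/WizardLM-70B-V1.0"
fp16_dir = "WizardLM-70B-V1.0-HF"  # float16 model re-saved as safetensors

# Original Model -> Float16 Model -> Safetensor Model
model = AutoModelForCausalLM.from_pretrained(src, torch_dtype=torch.float16)
model.save_pretrained(fp16_dir, safe_serialization=True)
AutoTokenizer.from_pretrained(src).save_pretrained(fp16_dir)

# Safetensor Model -> EXL2 Model, matching the table above: 4.0 bits per weight,
# 6-bit head, 2048 measurement/quantization length, wikitext-2-raw-v1 calibration.
subprocess.run([
    "python", "exllamav2/convert.py",
    "-i", fp16_dir,                                # input (fp16 safetensors)
    "-o", "exl2_work",                             # working directory
    "-cf", "WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2",  # compiled output directory
    "-c", "0000.parquet",                          # calibration parquet
    "-b", "4.0", "-hb", "6",
    "-l", "2048", "-ml", "2048",
], check=True)
```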