TheBloke committed on
Commit
aa3b83c
·
1 Parent(s): a299641

Upload README.md

Files changed (1)
  1. README.md +18 -12
README.md CHANGED
@@ -7,7 +7,7 @@ language:
7
  - en
8
  library_name: transformers
9
  license: apache-2.0
10
- model_creator: Jet Davis
11
  model_name: OpenInstruct Mistral 7B
12
  model_type: mistral
13
  pipeline_tag: text-generation
@@ -45,13 +45,13 @@ quantized_by: TheBloke
45
  <!-- header end -->
46
 
47
  # OpenInstruct Mistral 7B - GPTQ
48
- - Model creator: [Jet Davis](https://huggingface.co/monology)
49
  - Original model: [OpenInstruct Mistral 7B](https://huggingface.co/monology/openinstruct-mistral-7b)
50
 
51
  <!-- description start -->
52
  # Description
53
 
54
- This repo contains GPTQ model files for [Jet Davis's OpenInstruct Mistral 7B](https://huggingface.co/monology/openinstruct-mistral-7b).
55
 
56
  Multiple GPTQ parameter permutations are provided; see Provided Files below for details of the options provided, their parameters, and the software used to create them.
57
 
@@ -64,7 +64,7 @@ These files were quantised using hardware kindly provided by [Massed Compute](ht
64
  * [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/openinstruct-mistral-7B-AWQ)
65
  * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ)
66
  * [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GGUF)
67
- * [Jet Davis's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/monology/openinstruct-mistral-7b)
68
  <!-- repositories-available end -->
69
 
70
  <!-- prompt-template start -->
@@ -121,12 +121,12 @@ Most GPTQ files are made with AutoGPTQ. Mistral models are currently made with T
121
 
122
  | Branch | Bits | GS | Act Order | Damp % | GPTQ Dataset | Seq Len | Size | ExLlama | Desc |
123
  | ------ | ---- | -- | --------- | ------ | ------------ | ------- | ---- | ------- | ---- |
124
- | [main](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/main) | 4 | 128 | Yes | 0.1 | [open-instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 4.16 GB | Yes | 4-bit, with Act Order and group size 128g. Uses even less VRAM than 64g, but with slightly lower accuracy. |
125
- | [gptq-4bit-32g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-4bit-32g-actorder_True) | 4 | 32 | Yes | 0.1 | [open-instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 4.57 GB | Yes | 4-bit, with Act Order and group size 32g. Gives highest possible inference quality, with maximum VRAM usage. |
126
- | [gptq-8bit--1g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit--1g-actorder_True) | 8 | None | Yes | 0.1 | [open-instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 7.52 GB | No | 8-bit, with Act Order. No group size, to lower VRAM requirements. |
127
- | [gptq-8bit-128g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit-128g-actorder_True) | 8 | 128 | Yes | 0.1 | [open-instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 7.68 GB | No | 8-bit, with group size 128g for higher inference quality and with Act Order for even higher accuracy. |
128
- | [gptq-8bit-32g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit-32g-actorder_True) | 8 | 32 | Yes | 0.1 | [open-instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 8.17 GB | No | 8-bit, with group size 32g and Act Order for maximum inference quality. |
129
- | [gptq-4bit-64g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-4bit-64g-actorder_True) | 4 | 64 | Yes | 0.1 | [open-instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 4.29 GB | Yes | 4-bit, with Act Order and group size 64g. Uses less VRAM than 32g, but with slightly lower accuracy. |
130
 
131
  <!-- README_GPTQ.md-provided-files end -->
132
 
@@ -384,7 +384,7 @@ And thank you again to a16z for their generous grant.
384
 
385
  <!-- footer end -->
386
 
387
- # Original model card: Jet Davis's OpenInstruct Mistral 7B
388
 
389
 
390
  # OpenInstruct Mistral-7B
@@ -395,7 +395,7 @@ Quantized to FP16 and released under the [Apache-2.0](https://choosealicense.com
395
  Compute generously provided by [Higgsfield AI](https://higgsfield.ai/model/655559e6b5777dab620095e0).
396
 
397
 
398
- Prompt format: Alpaca
399
  ```
400
  Below is an instruction that describes a task. Write a response that appropriately completes the request.
401
 
@@ -405,4 +405,10 @@ Below is an instruction that describes a task. Write a response that appropriate
405
  ### Response:
406
  ```
407
 
 
 
 
 
 
 
408
  \*as of 21 Nov 2023. "commercially-usable" includes both an open-source base model and a *non-synthetic* open-source finetune dataset. updated leaderboard results available [here](https://huggingfaceh4-open-llm-leaderboard.hf.space).
 
7
  - en
8
  library_name: transformers
9
  license: apache-2.0
10
+ model_creator: Devin Gulliver
11
  model_name: OpenInstruct Mistral 7B
12
  model_type: mistral
13
  pipeline_tag: text-generation
 
45
  <!-- header end -->
46
 
47
  # OpenInstruct Mistral 7B - GPTQ
48
+ - Model creator: [Devin Gulliver](https://huggingface.co/monology)
49
  - Original model: [OpenInstruct Mistral 7B](https://huggingface.co/monology/openinstruct-mistral-7b)
50
 
51
  <!-- description start -->
52
  # Description
53
 
54
+ This repo contains GPTQ model files for [Devin Gulliver's OpenInstruct Mistral 7B](https://huggingface.co/monology/openinstruct-mistral-7b).
55
 
56
  Multiple GPTQ parameter permutations are provided; see Provided Files below for details of the options provided, their parameters, and the software used to create them.
57
 
 
64
  * [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/openinstruct-mistral-7B-AWQ)
65
  * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ)
66
  * [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GGUF)
67
+ * [Devin Gulliver's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/monology/openinstruct-mistral-7b)
68
  <!-- repositories-available end -->
69
 
70
  <!-- prompt-template start -->
 
121
 
122
  | Branch | Bits | GS | Act Order | Damp % | GPTQ Dataset | Seq Len | Size | ExLlama | Desc |
123
  | ------ | ---- | -- | --------- | ------ | ------------ | ------- | ---- | ------- | ---- |
124
+ | [main](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/main) | 4 | 128 | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 4.16 GB | Yes | 4-bit, with Act Order and group size 128g. Uses even less VRAM than 64g, but with slightly lower accuracy. |
125
+ | [gptq-4bit-32g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-4bit-32g-actorder_True) | 4 | 32 | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 4.57 GB | Yes | 4-bit, with Act Order and group size 32g. Gives highest possible inference quality, with maximum VRAM usage. |
126
+ | [gptq-8bit--1g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit--1g-actorder_True) | 8 | None | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 7.52 GB | No | 8-bit, with Act Order. No group size, to lower VRAM requirements. |
127
+ | [gptq-8bit-128g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit-128g-actorder_True) | 8 | 128 | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 7.68 GB | No | 8-bit, with group size 128g for higher inference quality and with Act Order for even higher accuracy. |
128
+ | [gptq-8bit-32g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit-32g-actorder_True) | 8 | 32 | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 8.17 GB | No | 8-bit, with group size 32g and Act Order for maximum inference quality. |
129
+ | [gptq-4bit-64g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-4bit-64g-actorder_True) | 4 | 64 | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 4.29 GB | Yes | 4-bit, with Act Order and group size 64g. Uses less VRAM than 32g, but with slightly lower accuracy. |
130
 
131
  <!-- README_GPTQ.md-provided-files end -->
132
 
 
384
 
385
  <!-- footer end -->
386
 
387
+ # Original model card: Devin Gulliver's OpenInstruct Mistral 7B
388
 
389
 
390
  # OpenInstruct Mistral-7B
 
395
  Compute generously provided by [Higgsfield AI](https://higgsfield.ai/model/655559e6b5777dab620095e0).
396
 
397
 
398
+ ## Prompt format: Alpaca
399
  ```
400
  Below is an instruction that describes a task. Write a response that appropriately completes the request.
401
 
 
405
  ### Response:
406
  ```
407
 
408
+ ## Recommended preset:
409
+ - temperature: 0.2
410
+ - top_k: 50
411
+ - top_p: 0.95
412
+ - repetition_penalty: 1.1
413
+
414
  \*as of 21 Nov 2023. "commercially-usable" includes both an open-source base model and a *non-synthetic* open-source finetune dataset. updated leaderboard results available [here](https://huggingfaceh4-open-llm-leaderboard.hf.space).
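
The recommended preset and the Alpaca prompt format added in this commit can be expressed as plain Python data. The following is a minimal sketch, not the model author's code: the helper names `build_prompt` and `generation_kwargs` are hypothetical, the `### Instruction:` section follows the standard Alpaca layout because the diff hides README lines 402-404, and `do_sample` and `max_new_tokens` are assumptions that are not part of the preset.

```python
# Sampling values copied from the "Recommended preset" lines added in this commit.
RECOMMENDED_PRESET = {
    "temperature": 0.2,
    "top_k": 50,
    "top_p": 0.95,
    "repetition_penalty": 1.1,
}

# Standard Alpaca layout. The diff hides README lines 402-404, so the
# "### Instruction:" section here is an assumption, not text from the commit.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Fill the Alpaca template with a single instruction (hypothetical helper)."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


def generation_kwargs(max_new_tokens: int = 256) -> dict:
    """Return sampling arguments built from the preset (hypothetical helper).

    do_sample and max_new_tokens are not part of the preset; they are added
    here as assumptions so that the sampling values take effect.
    """
    return {"do_sample": True, "max_new_tokens": max_new_tokens, **RECOMMENDED_PRESET}


if __name__ == "__main__":
    print(build_prompt("Summarise GPTQ in one sentence."))
    print(generation_kwargs())
```

With the `transformers` library, a dictionary like this could be unpacked into a text-generation call, for example `model.generate(**inputs, **generation_kwargs())`, assuming `model` and `inputs` were loaded and tokenised elsewhere.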