bartowski committed
Commit 8aaea32
1 Parent(s): a3ca77c

Add bits table

Files changed (1): README.md (+11 -15)

README.md CHANGED
@@ -20,23 +20,19 @@ Using <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.0.11">turb
 
 Each branch contains an individual bits per weight, with the main one containing only the measurement.json for further conversions.
 
-Conversion was done using the default calibration dataset.
-
-Default arguments used except when the bits per weight is above 6.0, at that point the lm_head layer is quantized at 8 bits per weight instead of the default 6.
-
 Original model: https://huggingface.co/mlabonne/NeuralBeagle14-7B
 
-
-<a href="https://huggingface.co/bartowski/NeuralBeagle14-7B-exl2/tree/8_0">8.0 bits per weight</a>
-
-<a href="https://huggingface.co/bartowski/NeuralBeagle14-7B-exl2/tree/6_5">6.5 bits per weight</a>
-
-<a href="https://huggingface.co/bartowski/NeuralBeagle14-7B-exl2/tree/5_0">5.0 bits per weight</a>
-
-<a href="https://huggingface.co/bartowski/NeuralBeagle14-7B-exl2/tree/4_0">4.0 bits per weight</a>
-
-<a href="https://huggingface.co/bartowski/NeuralBeagle14-7B-exl2/tree/3_5">3.5 bits per weight</a>
+Model Size: 7b
+
+| Branch | Bits | lm_head bits | Dataset | Size | Description |
+| ------ | ---- | ------------ | ------- | ---- | ----------- |
+| [8_0](https://huggingface.co/Bartowski/NeuralBeagle14-7B-exl2/tree/8_0) | 8.0 | 8.0 | Default | 9.8 GB | Maximum quality that ExLlamaV2 can produce, near unquantized performance. |
+| [6_5](https://huggingface.co/Bartowski/NeuralBeagle14-7B-exl2/tree/6_5) | 6.5 | 8.0 | Default | 8.6 GB | Very similar to 8.0, good tradeoff of size vs performance, **recommended**. |
+| [5_0](https://huggingface.co/Bartowski/NeuralBeagle14-7B-exl2/tree/5_0) | 5.0 | 6.0 | Default | 7.4 GB | Slightly lower perplexity vs 6.5. |
+| [4_0](https://huggingface.co/Bartowski/NeuralBeagle14-7B-exl2/tree/4_0) | 4.0 | 6.0 | Default | 6.5 GB | Just under GPTQ equivalent bits per weight. |
+| [3_5](https://huggingface.co/Bartowski/NeuralBeagle14-7B-exl2/tree/3_5) | 3.5 | 6.0 | Default | 6.1 GB | Lower quality, only use if you have to. |
+
+All VRAM requirements estimated from 16k context. For 32k context add ~2 GB.
 
 ## Download instructions
 
@@ -62,6 +58,6 @@ huggingface-cli download bartowski/NeuralBeagle14-7B-exl2 --local-dir NeuralBeag
 To download from a different branch, add the `--revision` parameter:
 
 ```shell
-mkdir NeuralBeagle14-7B-exl2
-huggingface-cli download bartowski/NeuralBeagle14-7B-exl2 --revision 4_0 --local-dir NeuralBeagle14-7B-exl2 --local-dir-use-symlinks False
+mkdir NeuralBeagle14-7B-exl2-6_5
+huggingface-cli download bartowski/NeuralBeagle14-7B-exl2 --revision 6_5 --local-dir NeuralBeagle14-7B-exl2-6_5 --local-dir-use-symlinks False
 ```
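As a side note on the download step above: the `--revision` command changed in this commit is the same for every quant branch except for the branch name, so it can be generated per branch. A minimal sketch — the `download_command` helper is hypothetical, not part of the README or the Hugging Face CLI:

```python
# Hypothetical helper: build the huggingface-cli command from the README
# for a given quant branch of the repo (e.g. "6_5", "4_0", "3_5").
def download_command(repo: str, branch: str) -> str:
    # Name the local directory after the repo plus the branch,
    # matching the README's example (NeuralBeagle14-7B-exl2-6_5).
    local_dir = f"{repo.split('/')[-1]}-{branch}"
    return (
        f"mkdir {local_dir} && "
        f"huggingface-cli download {repo} --revision {branch} "
        f"--local-dir {local_dir} --local-dir-use-symlinks False"
    )

print(download_command("bartowski/NeuralBeagle14-7B-exl2", "6_5"))
```

This only assembles the shell command string; running it still requires `huggingface-cli` (from the `huggingface_hub` package) to be installed.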