---
base_model: mistralai/Mixtral-8x7B-v0.1
language:
- fr
- it
- de
- es
- en
library_name: transformers
license: apache-2.0
quantized_by: mradermacher
tags:
- moe
---
## About

Weighted/imatrix quants of https://huggingface.co./mistralai/Mixtral-8x7B-v0.1

<!-- provided-files -->
Static quants are available at https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-GGUF

## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co./TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including how to concatenate multi-part files.
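
If you run into a quant that is split into multiple parts, the parts can
be rejoined by plain binary concatenation. The sketch below assumes the
`*.gguf.partXofY` naming convention; adjust the pattern to the actual
file names in the repository:

```python
# Sketch: rejoin a multi-part GGUF by simple binary concatenation.
# The part naming (*.gguf.part1of2, *.gguf.part2of2, ...) is assumed;
# adjust the glob pattern to the actual file names.
import glob
import re
import shutil

parts = glob.glob("Mixtral-8x7B-v0.1.i1-Q6_K.gguf.part*")
# Sort numerically so that part10 does not sort before part2.
parts.sort(key=lambda p: int(re.search(r"part(\d+)of", p).group(1)))

with open("Mixtral-8x7B-v0.1.i1-Q6_K.gguf", "wb") as out:
    for part in parts:
        with open(part, "rb") as src:
            shutil.copyfileobj(src, out)  # streamed copy; no huge RAM buffer
```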

## Provided Quants

(sorted by size, not necessarily quality; IQ-quants are often preferable to similarly sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ1_S.gguf) | i1-IQ1_S | 9.8 | for the desperate |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ1_M.gguf) | i1-IQ1_M | 11.1 | mostly desperate |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 12.6 |  |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ2_XS.gguf) | i1-IQ2_XS | 13.9 |  |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ2_S.gguf) | i1-IQ2_S | 14.4 |  |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ2_M.gguf) | i1-IQ2_M | 15.8 |  |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-Q2_K.gguf) | i1-Q2_K | 17.6 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 18.6 | lower quality |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-Q3_K_XS.gguf) | i1-Q3_K_XS | 19.3 |  |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ3_XS.gguf) | i1-IQ3_XS | 19.5 |  |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ3_S.gguf) | i1-IQ3_S | 20.7 | beats Q3_K* |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-Q3_K_S.gguf) | i1-Q3_K_S | 20.7 | IQ3_XS probably better |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ3_M.gguf) | i1-IQ3_M | 21.7 |  |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-Q3_K_M.gguf) | i1-Q3_K_M | 22.8 | IQ3_S probably better |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-Q3_K_L.gguf) | i1-Q3_K_L | 24.4 | IQ3_M probably better |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ4_XS.gguf) | i1-IQ4_XS | 25.3 |  |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-IQ4_NL.gguf) | i1-IQ4_NL | 26.8 | prefer IQ4_XS |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-Q4_0.gguf) | i1-Q4_0 | 26.8 | fast, low quality |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-Q4_K_S.gguf) | i1-Q4_K_S | 27.0 | optimal size/speed/quality |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-Q4_K_M.gguf) | i1-Q4_K_M | 28.7 | fast, recommended |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-Q5_K_S.gguf) | i1-Q5_K_S | 32.5 |  |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-Q5_K_M.gguf) | i1-Q5_K_M | 33.5 |  |
| [GGUF](https://huggingface.co./mradermacher/Mixtral-8x7B-v0.1-i1-GGUF/resolve/main/Mixtral-8x7B-v0.1.i1-Q6_K.gguf) | i1-Q6_K | 38.6 | practically like static Q6_K |
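
For example, the recommended Q4_K_M file can be fetched and loaded as in
the sketch below (an illustration, not an official recipe: it assumes the
`huggingface_hub` and `llama-cpp-python` packages are installed and that
roughly 30 GB of RAM/VRAM are available):

```python
# Sketch: download one quant from this repository and run a completion.
# Assumes `pip install huggingface_hub llama-cpp-python` and enough
# memory for the ~28.7 GB Q4_K_M file listed above.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="mradermacher/Mixtral-8x7B-v0.1-i1-GGUF",
    filename="Mixtral-8x7B-v0.1.i1-Q4_K_M.gguf",
)

llm = Llama(
    model_path=model_path,
    n_ctx=4096,       # context window for this session
    n_gpu_layers=-1,  # offload all layers to the GPU if one is available
)

# The base model is not instruction-tuned, so use a plain completion prompt.
out = llm("The Mixtral 8x7B architecture is", max_tokens=64)
print(out["choices"][0]["text"])
```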

Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

## FAQ / Model Request

See https://huggingface.co./mradermacher/model_requests for answers to
common questions, and if you would like another model quantized.

## Thanks

I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and for providing upgrades to my workstation, which
enable this work in my free time.

<!-- end -->