---
base_model: OpenLLM-Ro/RoMistral-7b-Instruct
inference: false
language:
- ro
library_name: gguf
license: cc-by-nc-4.0
pipeline_tag: text-generation
quantized_by: legraphista
tags:
- quantized
- GGUF
- imatrix
- quantization
---

# RoMistral-7b-Instruct-IMat-GGUF
_Llama.cpp imatrix quantization of OpenLLM-Ro/RoMistral-7b-Instruct_

Original Model: [OpenLLM-Ro/RoMistral-7b-Instruct](https://huggingface.co./OpenLLM-Ro/RoMistral-7b-Instruct)  
Original dtype: `BF16` (`bfloat16`)  
Quantized by: llama.cpp [b2998](https://github.com/ggerganov/llama.cpp/releases/tag/b2998)  
IMatrix dataset: [here](https://gist.githubusercontent.com/legraphista/d6d93f1a254bcfc58e0af3777eaec41e/raw/d380e7002cea4a51c33fffd47db851942754e7cc/imatrix.calibration.medium.raw)  

## Files

### IMatrix
Status: βœ… Available  
Link: [here](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/imatrix.dat) 

### Common Quants
| Filename | Quant type | File Size | Status | Uses IMatrix | Is Split |
| -------- | ---------- | --------- | ------ | ------------ | -------- |
| [RoMistral-7b-Instruct.Q8_0.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q8_0.gguf) | Q8_0 | 7.70GB | βœ… Available | βšͺ No | πŸ“¦ No
| [RoMistral-7b-Instruct.Q6_K.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q6_K.gguf) | Q6_K | 5.94GB | βœ… Available | βšͺ No | πŸ“¦ No
| [RoMistral-7b-Instruct.Q4_K.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q4_K.gguf) | Q4_K | 4.37GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.Q3_K.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q3_K.gguf) | Q3_K | 3.52GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.Q2_K.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q2_K.gguf) | Q2_K | 2.72GB | βœ… Available | 🟒 Yes | πŸ“¦ No


### All Quants
| Filename | Quant type | File Size | Status | Uses IMatrix | Is Split |
| -------- | ---------- | --------- | ------ | ------------ | -------- |
| [RoMistral-7b-Instruct.FP16.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.FP16.gguf) | F16 | 14.48GB | βœ… Available | βšͺ No | πŸ“¦ No
| [RoMistral-7b-Instruct.BF16.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.BF16.gguf) | BF16 | 14.48GB | βœ… Available | βšͺ No | πŸ“¦ No
| [RoMistral-7b-Instruct.Q5_K.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q5_K.gguf) | Q5_K | 5.13GB | βœ… Available | βšͺ No | πŸ“¦ No
| [RoMistral-7b-Instruct.Q5_K_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q5_K_S.gguf) | Q5_K_S | 5.00GB | βœ… Available | βšͺ No | πŸ“¦ No
| [RoMistral-7b-Instruct.Q4_K_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q4_K_S.gguf) | Q4_K_S | 4.14GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.Q3_K_L.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q3_K_L.gguf) | Q3_K_L | 3.82GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.Q3_K_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q3_K_S.gguf) | Q3_K_S | 3.16GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.Q2_K_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q2_K_S.gguf) | Q2_K_S | 2.53GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ4_NL.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ4_NL.gguf) | IQ4_NL | 4.13GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ4_XS.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ4_XS.gguf) | IQ4_XS | 3.91GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ3_M.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ3_M.gguf) | IQ3_M | 3.28GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ3_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ3_S.gguf) | IQ3_S | 3.18GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ3_XS.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ3_XS.gguf) | IQ3_XS | 3.02GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ3_XXS.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ3_XXS.gguf) | IQ3_XXS | 2.83GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ2_M.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ2_M.gguf) | IQ2_M | 2.50GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ2_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ2_S.gguf) | IQ2_S | 2.31GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ2_XS.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ2_XS.gguf) | IQ2_XS | 2.20GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ2_XXS.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ2_XXS.gguf) | IQ2_XXS | 1.99GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ1_M.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ1_M.gguf) | IQ1_M | 1.75GB | βœ… Available | 🟒 Yes | πŸ“¦ No
| [RoMistral-7b-Instruct.IQ1_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ1_S.gguf) | IQ1_S | 1.61GB | βœ… Available | 🟒 Yes | πŸ“¦ No
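
As a rule of thumb, pick the largest quant whose file fits in your available (V)RAM with some headroom left for the KV cache and context. A minimal sketch of that choice (sizes in GB copied from the tables above; the function name and the 1 GB default headroom are illustrative, not a recommendation from llama.cpp):

```python
from typing import Optional

# File sizes (GB) as listed in the quant tables above.
QUANT_SIZES_GB = {
    "Q8_0": 7.70, "Q6_K": 5.94, "Q5_K": 5.13, "Q5_K_S": 5.00,
    "Q4_K": 4.37, "Q4_K_S": 4.14, "IQ4_NL": 4.13, "IQ4_XS": 3.91,
    "Q3_K_L": 3.82, "Q3_K": 3.52, "IQ3_M": 3.28, "IQ3_S": 3.18,
    "Q3_K_S": 3.16, "IQ3_XS": 3.02, "IQ3_XXS": 2.83, "Q2_K": 2.72,
    "Q2_K_S": 2.53, "IQ2_M": 2.50, "IQ2_S": 2.31, "IQ2_XS": 2.20,
    "IQ2_XXS": 1.99, "IQ1_M": 1.75, "IQ1_S": 1.61,
}

def pick_quant(budget_gb: float, headroom_gb: float = 1.0) -> Optional[str]:
    """Return the largest quant that fits budget_gb minus headroom, or None."""
    usable = budget_gb - headroom_gb
    fitting = [q for q, size in QUANT_SIZES_GB.items() if size <= usable]
    if not fitting:
        return None
    return max(fitting, key=QUANT_SIZES_GB.get)
```

For example, with 8 GB of VRAM and the default headroom this picks `Q6_K`; tighten or widen the headroom to taste.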


## Downloading using huggingface-cli
First, make sure you have huggingface-cli installed:
```
pip install -U "huggingface_hub[cli]"
```
Then, you can target the specific file you want:
```
huggingface-cli download legraphista/RoMistral-7b-Instruct-IMat-GGUF --include "RoMistral-7b-Instruct.Q8_0.gguf" --local-dir ./
```
If the model is bigger than 50GB, it will have been split into multiple files. To download them all to a local folder, run:
```
huggingface-cli download legraphista/RoMistral-7b-Instruct-IMat-GGUF --include "RoMistral-7b-Instruct.Q8_0/*" --local-dir RoMistral-7b-Instruct.Q8_0
# see FAQ for merging GGUF's
```

## FAQ

### Why is the IMatrix not applied everywhere?
According to [this investigation](https://www.reddit.com/r/LocalLLaMA/comments/1993iro/ggufs_quants_can_punch_above_their_weights_now/), it appears that only the lower quantizations benefit from the imatrix input (as per HellaSwag results).

### How do I merge a split GGUF?
1. Make sure you have `gguf-split` available
    - To get hold of `gguf-split`, navigate to https://github.com/ggerganov/llama.cpp/releases
    - Download the appropriate zip for your system from the latest release
    - Unzip the archive and you should be able to find `gguf-split`
2. Locate your GGUF chunks folder (ex: `RoMistral-7b-Instruct.Q8_0`)
3. Run `gguf-split --merge RoMistral-7b-Instruct.Q8_0/RoMistral-7b-Instruct.Q8_0-00001-of-XXXXX.gguf RoMistral-7b-Instruct.Q8_0.gguf`
    - Make sure to point `gguf-split` to the first chunk of the split.
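
If you script this, the first chunk can be located programmatically from the `-00001-of-XXXXX.gguf` naming shown in step 3. A small stdlib-only sketch (the helper name is illustrative):

```python
from pathlib import Path

def first_chunk(folder: str) -> Path:
    """Find the first chunk of a split GGUF (named *-00001-of-XXXXX.gguf)."""
    matches = sorted(Path(folder).glob("*-00001-of-*.gguf"))
    if not matches:
        raise FileNotFoundError(f"no first chunk found in {folder}")
    return matches[0]
```

Pass the resulting path as the first argument to `gguf-split --merge`.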

---

Got a suggestion? Ping me [@legraphista](https://x.com/legraphista)!