---
base_model: OpenLLM-Ro/RoMistral-7b-Instruct
inference: false
language:
- ro
library_name: gguf
license: cc-by-nc-4.0
pipeline_tag: text-generation
quantized_by: legraphista
tags:
- quantized
- GGUF
- imatrix
- quantization
---
# RoMistral-7b-Instruct-IMat-GGUF
_Llama.cpp imatrix quantization of OpenLLM-Ro/RoMistral-7b-Instruct_
Original Model: [OpenLLM-Ro/RoMistral-7b-Instruct](https://huggingface.co./OpenLLM-Ro/RoMistral-7b-Instruct)
Original dtype: `BF16` (`bfloat16`)
Quantized by: llama.cpp [b2998](https://github.com/ggerganov/llama.cpp/releases/tag/b2998)
IMatrix dataset: [here](https://gist.githubusercontent.com/legraphista/d6d93f1a254bcfc58e0af3777eaec41e/raw/d380e7002cea4a51c33fffd47db851942754e7cc/imatrix.calibration.medium.raw)
## Files
### IMatrix
Status: ✅ Available
Link: [here](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/imatrix.dat)
### Common Quants
| Filename | Quant type | File Size | Status | Uses IMatrix | Is Split |
| -------- | ---------- | --------- | ------ | ------------ | -------- |
| [RoMistral-7b-Instruct.Q8_0.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q8_0.gguf) | Q8_0 | 7.70GB | ✅ Available | ⚪ No | 📦 No |
| [RoMistral-7b-Instruct.Q6_K.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q6_K.gguf) | Q6_K | 5.94GB | ✅ Available | ⚪ No | 📦 No |
| [RoMistral-7b-Instruct.Q4_K.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q4_K.gguf) | Q4_K | 4.37GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.Q3_K.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q3_K.gguf) | Q3_K | 3.52GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.Q2_K.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q2_K.gguf) | Q2_K | 2.72GB | ✅ Available | 🟢 Yes | 📦 No |
### All Quants
| Filename | Quant type | File Size | Status | Uses IMatrix | Is Split |
| -------- | ---------- | --------- | ------ | ------------ | -------- |
| [RoMistral-7b-Instruct.FP16.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.FP16.gguf) | F16 | 14.48GB | ✅ Available | ⚪ No | 📦 No |
| [RoMistral-7b-Instruct.BF16.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.BF16.gguf) | BF16 | 14.48GB | ✅ Available | ⚪ No | 📦 No |
| [RoMistral-7b-Instruct.Q5_K.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q5_K.gguf) | Q5_K | 5.13GB | ✅ Available | ⚪ No | 📦 No |
| [RoMistral-7b-Instruct.Q5_K_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q5_K_S.gguf) | Q5_K_S | 5.00GB | ✅ Available | ⚪ No | 📦 No |
| [RoMistral-7b-Instruct.Q4_K_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q4_K_S.gguf) | Q4_K_S | 4.14GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.Q3_K_L.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q3_K_L.gguf) | Q3_K_L | 3.82GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.Q3_K_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q3_K_S.gguf) | Q3_K_S | 3.16GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.Q2_K_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.Q2_K_S.gguf) | Q2_K_S | 2.53GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ4_NL.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ4_NL.gguf) | IQ4_NL | 4.13GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ4_XS.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ4_XS.gguf) | IQ4_XS | 3.91GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ3_M.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ3_M.gguf) | IQ3_M | 3.28GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ3_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ3_S.gguf) | IQ3_S | 3.18GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ3_XS.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ3_XS.gguf) | IQ3_XS | 3.02GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ3_XXS.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ3_XXS.gguf) | IQ3_XXS | 2.83GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ2_M.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ2_M.gguf) | IQ2_M | 2.50GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ2_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ2_S.gguf) | IQ2_S | 2.31GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ2_XS.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ2_XS.gguf) | IQ2_XS | 2.20GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ2_XXS.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ2_XXS.gguf) | IQ2_XXS | 1.99GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ1_M.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ1_M.gguf) | IQ1_M | 1.75GB | ✅ Available | 🟢 Yes | 📦 No |
| [RoMistral-7b-Instruct.IQ1_S.gguf](https://huggingface.co./legraphista/RoMistral-7b-Instruct-IMat-GGUF/blob/main/RoMistral-7b-Instruct.IQ1_S.gguf) | IQ1_S | 1.61GB | ✅ Available | 🟢 Yes | 📦 No |
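When choosing a file, a reasonable rule of thumb is to compare the sizes above against your available RAM/VRAM, leaving some headroom for the KV cache and activations. A minimal sketch of that logic (the `pick_quant` helper and the ~1GB overhead figure are illustrative assumptions, not part of this repo):

```python
from typing import Optional

# Common-quant file sizes, taken from the table above.
QUANT_SIZES_GB = {
    "Q8_0": 7.70,
    "Q6_K": 5.94,
    "Q4_K": 4.37,
    "Q3_K": 3.52,
    "Q2_K": 2.72,
}

def pick_quant(budget_gb: float, overhead_gb: float = 1.0) -> Optional[str]:
    """Return the largest common quant whose file size plus a rough
    overhead allowance fits in `budget_gb`, or None if nothing fits."""
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s + overhead_gb <= budget_gb}
    return max(fitting, key=fitting.get) if fitting else None
```

For example, with an 8GB budget this suggests `Q6_K` (7.70 + 1.0 GB for `Q8_0` would not fit).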
## Downloading using huggingface-cli
First, make sure you have `huggingface-cli` installed:
```
pip install -U "huggingface_hub[cli]"
```
Then, you can target the specific file you want:
```
huggingface-cli download legraphista/RoMistral-7b-Instruct-IMat-GGUF --include "RoMistral-7b-Instruct.Q8_0.gguf" --local-dir ./
```
If the model is bigger than 50GB, it will have been split into multiple files. In order to download them all to a local folder, run:
```
huggingface-cli download legraphista/RoMistral-7b-Instruct-IMat-GGUF --include "RoMistral-7b-Instruct.Q8_0/*" --local-dir RoMistral-7b-Instruct.Q8_0
# see FAQ for merging GGUF's
```
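The same files can also be fetched from Python via the `huggingface_hub` library. A hedged sketch (`quant_filename` and `download` are hypothetical helpers mirroring this repo's naming scheme, not part of it):

```python
REPO_ID = "legraphista/RoMistral-7b-Instruct-IMat-GGUF"

def quant_filename(quant: str) -> str:
    """Build the filename for a quant type, following this repo's
    `RoMistral-7b-Instruct.<QUANT>.gguf` naming scheme."""
    return f"RoMistral-7b-Instruct.{quant}.gguf"

def download(quant: str, local_dir: str = ".") -> str:
    """Download one quant file from the Hub; returns the local path."""
    # Imported lazily so the naming helper works without the package installed.
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=REPO_ID,
                           filename=quant_filename(quant),
                           local_dir=local_dir)
```

For example, `download("Q4_K")` would fetch the Q4_K file into the current directory.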
## FAQ
### Why is the IMatrix not applied everywhere?
According to [this investigation](https://www.reddit.com/r/LocalLLaMA/comments/1993iro/ggufs_quants_can_punch_above_their_weights_now/), it appears that lower quantizations are the only ones that benefit from the imatrix input (as per hellaswag results).
### How do I merge a split GGUF?
1. Make sure you have `gguf-split` available
- To get hold of `gguf-split`, navigate to https://github.com/ggerganov/llama.cpp/releases
- Download the appropriate zip for your system from the latest release
- Unzip the archive and you should be able to find `gguf-split`
2. Locate your GGUF chunks folder (ex: `RoMistral-7b-Instruct.Q8_0`)
3. Run `gguf-split --merge RoMistral-7b-Instruct.Q8_0/RoMistral-7b-Instruct.Q8_0-00001-of-XXXXX.gguf RoMistral-7b-Instruct.Q8_0.gguf`
- Make sure to point `gguf-split` to the first chunk of the split.
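After merging, one quick sanity check is to verify that the output file starts with the 4-byte `GGUF` magic that the GGUF format defines. A small sketch (these helper names are illustrative):

```python
def is_gguf_header(header: bytes) -> bool:
    """True if the buffer starts with the 4-byte GGUF magic."""
    return header[:4] == b"GGUF"

def looks_like_gguf(path: str) -> bool:
    """Read the first 4 bytes of a file and check the GGUF magic."""
    with open(path, "rb") as f:
        return is_gguf_header(f.read(4))
```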
---
Got a suggestion? Ping me [@legraphista](https://x.com/legraphista)!