Lewdiculous committed: Update README.md
README.md CHANGED
@@ -22,14 +22,14 @@ The **Imatrix** is calculated based on calibration data, and it helps determine
 
 One of the benefits of using an Imatrix is that it can lead to better model performance, especially when the calibration data is diverse.
 
+More information: [[1]](https://github.com/ggerganov/llama.cpp/discussions/5006) [[2]](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)
+
 For --imatrix data, `imatrix-Loyal-Toppy-Bruins-Maid-7B-DARE-F16.dat` was used.
 
 `Base⇢ GGUF(F16)⇢ Imatrix-Data(F16)⇢ GGUF(Imatrix-Quants)`
 
 Using [llama.cpp](https://github.com/ggerganov/llama.cpp/)-[b2280](https://github.com/ggerganov/llama.cpp/releases/tag/b2280).
 
-[[More information in this discussion.]](https://github.com/ggerganov/llama.cpp/discussions/5006)
-
 The new **IQ3_S** quant-option has shown to be better than the old Q3_K_S, so I added that instead of the latter. Only supported in `koboldcpp-1.59.1` or higher.
 
 *If you want any specific quantization to be added, feel free to ask.*
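
The `Base⇢ GGUF(F16)⇢ Imatrix-Data(F16)⇢ GGUF(Imatrix-Quants)` pipeline in this README maps onto the llama.cpp tools roughly as follows. This is a minimal sketch against a b2280-era checkout; the model directory, calibration file name, and output paths are illustrative placeholders, not the exact commands used here.

```bash
# Sketch of the Base -> GGUF(F16) -> Imatrix-Data(F16) -> Imatrix-Quants
# pipeline with llama.cpp (b2280-era tools). Paths below are placeholders.

# 1. Convert the base HF model to an F16 GGUF.
python convert.py ./Loyal-Toppy-Bruins-Maid-7B-DARE \
  --outtype f16 --outfile model-F16.gguf

# 2. Compute the importance matrix from calibration text.
./imatrix -m model-F16.gguf -f calibration-data.txt \
  -o imatrix-Loyal-Toppy-Bruins-Maid-7B-DARE-F16.dat

# 3. Produce an imatrix-aware quant, e.g. the IQ3_S mentioned above.
./quantize --imatrix imatrix-Loyal-Toppy-Bruins-Maid-7B-DARE-F16.dat \
  model-F16.gguf model-IQ3_S.gguf IQ3_S
```

The same `quantize --imatrix ...` invocation covers the other quant types, swapping `IQ3_S` for the desired type name.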