Update README.md
README.md CHANGED

@@ -34,15 +34,6 @@ inference: false
 **Description:**
 - The motivation behind these quantizations was that latestissue's quants were missing the 0.1B and 0.4B models. The rest of the models can be found here: [latestissue/rwkv-4-world-ggml-quantized](https://huggingface.co/latestissue/rwkv-4-world-ggml-quantized)
 
-**RAM usage (WIP):**
-Model | Startup RAM usage (KoboldCpp)
-:--:|:--:
-pygmalion-6b-dev.q4_0.bin | 3.7 GiB
-pygmalion-6b-dev.q4_1.bin | 4.1 GiB
-pygmalion-6b-dev.q5_0.bin | 4.4 GiB
-pygmalion-6b-dev.q5_1.bin | 4.8 GiB
-pygmalion-6b-dev.q8_0.bin | 6.5 GiB
-
 **Notes:**
 - rwkv.cpp [[0df970a]](https://github.com/saharNooby/rwkv.cpp/tree/0df970a6adddd4b938795f92e660766d1e2c1c1f) was used for conversion and quantization. First they were converted to f16 ggml files, then quantized.
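For reference, the two-step workflow mentioned in the notes (convert to an f16 ggml file, then quantize) uses the two scripts shipped with rwkv.cpp. Below is a minimal sketch of that pipeline, assuming a checkout at the pinned commit [0df970a]; the file names are placeholders, and the exact data-type and format argument spellings should be checked against that commit's README.

```python
# Minimal sketch of the pipeline from the notes: convert a PyTorch checkpoint
# to an f16 ggml file, then quantize it into several formats.
# Assumes a checkout of rwkv.cpp at the pinned commit; file names are
# placeholders, and argument spellings follow the rwkv.cpp README.
import subprocess

PTH = "RWKV-4-World-0.1B-v1-20230520-ctx4096.pth"  # placeholder checkpoint
F16 = "rwkv-4-world-0.1b-f16.bin"                  # intermediate f16 ggml file

# Step 1: PyTorch .pth -> f16 ggml.
subprocess.run(
    ["python", "rwkv/convert_pytorch_to_ggml.py", PTH, F16, "FP16"],
    check=True,
)

# Step 2: f16 ggml -> one quantized file per target format.
for fmt in ["Q4_0", "Q4_1", "Q5_0", "Q5_1", "Q8_0"]:
    out = F16.replace("f16", fmt.lower())
    subprocess.run(["python", "rwkv/quantize.py", F16, out, fmt], check=True)
```

Quantizing from the f16 intermediate rather than re-converting per format keeps the expensive checkpoint conversion to a single pass.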