Text Generation
rwkv
causal-lm
ggml
Crataco committed on
Commit b622395
1 Parent(s): 58d0487

Update README.md

Files changed (1)
  1. README.md +0 -9
README.md CHANGED
```diff
@@ -34,15 +34,6 @@ inference: false
 **Description:**
 - The motivation behind these quantizations was that latestissue's quants were missing the 0.1B and 0.4B models. The rest of the models can be found here: [latestissue/rwkv-4-world-ggml-quantized](https://huggingface.co/latestissue/rwkv-4-world-ggml-quantized)
 
-**RAM usage (WIP):**
-Model | Startup RAM usage (KoboldCpp)
-:--:|:--:
-pygmalion-6b-dev.q4_0.bin | 3.7 GiB
-pygmalion-6b-dev.q4_1.bin | 4.1 GiB
-pygmalion-6b-dev.q5_0.bin | 4.4 GiB
-pygmalion-6b-dev.q5_1.bin | 4.8 GiB
-pygmalion-6b-dev.q8_0.bin | 6.5 GiB
-
 **Notes:**
 - rwkv.cpp [[0df970a]](https://github.com/saharNooby/rwkv.cpp/tree/0df970a6adddd4b938795f92e660766d1e2c1c1f) was used for conversion and quantization. First they were converted to f16 ggml files, then quantized.
 
```
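For readers reproducing these files, the two-step pipeline mentioned in the notes looks roughly like the sketch below. It assumes a checkout of rwkv.cpp at commit 0df970a; the script names match that repository's conversion and quantization tools, but the file paths are illustrative placeholders and exact argument spellings may differ between revisions.

```python
# Minimal sketch of the convert-then-quantize pipeline, assuming an
# rwkv.cpp checkout at commit 0df970a is the working directory.
# All file paths below are placeholders, not the actual filenames in
# this repository.
import subprocess

PTH_CHECKPOINT = "RWKV-4-World-0.1B.pth"       # original PyTorch checkpoint (placeholder)
F16_GGML = "rwkv-4-world-0.1b-f16.bin"         # intermediate f16 ggml file
QUANTIZED = "rwkv-4-world-0.1b-q5_1.bin"       # final quantized ggml file

# Step 1: convert the PyTorch checkpoint to an f16 ggml file.
subprocess.run(
    ["python", "rwkv/convert_pytorch_to_ggml.py", PTH_CHECKPOINT, F16_GGML, "FP16"],
    check=True,
)

# Step 2: quantize the f16 file. The other formats (Q4_0, Q4_1, Q5_0,
# Q8_0) are produced the same way by changing the last argument.
subprocess.run(
    ["python", "rwkv/quantize.py", F16_GGML, QUANTIZED, "Q5_1"],
    check=True,
)
```

Keeping the f16 intermediate around makes it cheap to regenerate every quantization format from a single conversion pass.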