NeoChen1024 committed on
Commit
d354767
1 Parent(s): b140b81

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -7,9 +7,9 @@ base_model:
  IQ1_S (8.0G, 16.8624 +/- 0.24892, fits into 10GiB VRAM, just for kicks and giggles, not really usable)
  IQ1_M (8.6G, 13.9588 +/- 0.19871, fits into 12GiB VRAM, just for kicks and giggles, not really usable)
  IQ2_M ( 12G, 10.1401 +/- 0.14062, fits into 16GiB VRAM + 6144 context with q4_1 KV cache)
- IQ4_XS ( 18G, 9.4489 +/- 0.13005, fits into 24GiB VRAM + 8192 context with q4_1 KV cache, also room for 2048 ubatch)
- IQ4_NL ( 19G, 9.4632 +/- 0.13056, fits into 24GiB VRAM + 8192 context with q4_1 KV cache)
- Q4_K_M ( 21G, 9.3738 +/- 0.12900, fits into 24GiB VRAM + 6144 context with q4_1 KV cache, also good for CPU inference on E5-26xx v3/v4)
- Q8_0 ( 35G, 9.3277 +/- 0.12781, probably isn't practical for anything unless you have big GPU array, imatrix derived from it)
+ IQ4_XS ( 18G, 9.4489 +/- 0.13005, fits into 24GiB VRAM + 8192 context with q4_1 KV cache, also room for 2048 ubatch)
+ IQ4_NL ( 19G, 9.4632 +/- 0.13056, fits into 24GiB VRAM + 8192 context with q4_1 KV cache)
+ Q4_K_M ( 21G, 9.3738 +/- 0.12900, fits into 24GiB VRAM + 6144 context with q4_1 KV cache, also good for CPU inference on E5-26xx v3/v4)
+ Q8_0 ( 35G, 9.3277 +/- 0.12781, probably isn't practical for anything unless you have big GPU array, imatrix derived from it)
  ```
  Perplexity measured with `-fa -ctv q4_1 -ctk q4_1 -c 2048 -ub 2048` on UTF-8 text version of ["Wired Love" from Project Gutenberg](http://www.gutenberg.org/ebooks/24353).
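
The perplexity figures in the list are easier to compare when expressed relative to the largest quant. The following is a minimal sketch, not part of the README: the values are copied from the list above, the helper names `relative_increase` and `within_uncertainty` are hypothetical, and using Q8_0 as the baseline is an assumption made only for illustration. The overlap check is a rough heuristic, not a significance test.

```python
# Perplexity (mean, +/- uncertainty) copied from the quant list in the README diff.
PERPLEXITY = {
    "IQ1_S": (16.8624, 0.24892),
    "IQ1_M": (13.9588, 0.19871),
    "IQ2_M": (10.1401, 0.14062),
    "IQ4_XS": (9.4489, 0.13005),
    "IQ4_NL": (9.4632, 0.13056),
    "Q4_K_M": (9.3738, 0.12900),
    "Q8_0": (9.3277, 0.12781),
}


def relative_increase(quant, baseline="Q8_0"):
    """Percent perplexity increase of `quant` over `baseline` (hypothetical helper)."""
    ppl, _ = PERPLEXITY[quant]
    base, _ = PERPLEXITY[baseline]
    return (ppl / base - 1.0) * 100.0


def within_uncertainty(quant, baseline="Q8_0"):
    """True if the two means differ by less than the sum of their reported +/- values."""
    ppl, err = PERPLEXITY[quant]
    base, base_err = PERPLEXITY[baseline]
    return abs(ppl - base) < err + base_err


if __name__ == "__main__":
    # Print one summary row per quant, in the same order as the README list.
    for name in PERPLEXITY:
        pct = relative_increase(name)
        overlap = within_uncertainty(name)
        print(f"{name:7s} {pct:+7.2f}%  overlaps Q8_0: {overlap}")
```

Under these assumptions the 4-bit quants land within the reported uncertainty of Q8_0, while IQ2_M and the 1-bit quants do not.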