ZeroWw committed on
Commit
53a7f89
·
verified ·
1 Parent(s): a1a5b56

Create README.md

Files changed (1)
  1. README.md +14 -0
README.md ADDED
@@ -0,0 +1,14 @@
+ ---
+ license: mit
+ language:
+ - en
+ ---
+
+ My own quantizations.
+ Output and embedding tensors are quantized to f16.
+ All other tensors are quantized to q5_k or q6_k.
+ The q8_0 version is pure (all tensors quantized to Q8_0, just for reference).
+
+ Result:
+ Both f16.q6 and f16.q5 are smaller than the standard q8_0 quantization,
+ and they perform as well as the pure f16.
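
The size claim in the README can be checked with back-of-the-envelope arithmetic. The sketch below is not the author's method. It assumes the usual GGUF block layouts (q8_0: 34 bytes per 32 weights, q6_k: 210 bytes per 256 weights, q5_k: 176 bytes per 256 weights) and a hypothetical model shape (32,000-token vocabulary, hidden size 4,096, 7 billion weights in total) that does not come from the README. It also ignores metadata and small tensors such as norms. Under those assumptions, the mixed file stays smaller than pure q8_0 as long as the f16 tensors hold less than roughly 20.5% of the weights for f16.q6, or roughly 28.6% for f16.q5.

```python
# Approximate storage cost in bits per weight (assumed GGUF block layouts).
BITS_PER_WEIGHT = {
    "f16": 16.0,
    "q8_0": 34 * 8 / 32,    # 34-byte block per 32 weights
    "q6_k": 210 * 8 / 256,  # 210-byte block per 256 weights
    "q5_k": 176 * 8 / 256,  # 176-byte block per 256 weights
}


def size_gib(n_edge, n_other, edge_type, other_type):
    """Estimated tensor data size in GiB for a two-type mix."""
    bits = (n_edge * BITS_PER_WEIGHT[edge_type]
            + n_other * BITS_PER_WEIGHT[other_type])
    return bits / 8 / 2**30


def break_even_fraction(other_type, edge_type="f16", ref_type="q8_0"):
    """Share of weights kept in edge_type at which the mix equals pure ref_type."""
    b = BITS_PER_WEIGHT
    return (b[ref_type] - b[other_type]) / (b[edge_type] - b[other_type])


# Hypothetical model shape (an assumption, not taken from the README).
VOCAB, HIDDEN, TOTAL = 32_000, 4_096, 7_000_000_000
edge = 2 * VOCAB * HIDDEN   # token embedding matrix + output matrix
other = TOTAL - edge        # everything else

sizes = {
    "q8_0": size_gib(edge, other, "q8_0", "q8_0"),    # pure reference
    "f16.q6": size_gib(edge, other, "f16", "q6_k"),
    "f16.q5": size_gib(edge, other, "f16", "q5_k"),
}

for name, gib in sizes.items():
    print(f"{name:7s} ~{gib:.2f} GiB")

print(f"f16 share of weights: {edge / TOTAL:.1%}")
print(f"break-even share vs q8_0 with q6_k: {break_even_fraction('q6_k'):.1%}")
print(f"break-even share vs q8_0 with q5_k: {break_even_fraction('q5_k'):.1%}")
```

For small models with large vocabularies the embedding and output matrices hold a larger share of the weights, so the margin over q8_0 narrows. The estimate says nothing about output quality, which has to be measured separately.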