Upload folder using huggingface_hub

Files changed (6)

.gitattributes CHANGED

@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+Yi-1.5-6B-Chat.f16.gguf filter=lfs diff=lfs merge=lfs -text
+Yi-1.5-6B-Chat.q5_k.gguf filter=lfs diff=lfs merge=lfs -text
+Yi-1.5-6B-Chat.q6_k.gguf filter=lfs diff=lfs merge=lfs -text
+Yi-1.5-6B-Chat.q8_0.gguf filter=lfs diff=lfs merge=lfs -text

README.md ADDED

+---
+license: mit
+language:
+- en
+---
+My own (ZeroWw) quantizations.
+The output and embedding tensors are quantized to f16.
+All other tensors are quantized to q5_k or q6_k.
+Result:
+Both f16.q6 and f16.q5 are smaller than the standard q8_0 quantization,
+and they perform as well as the pure f16.
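
The size part of the README's claim can be checked against the `size` fields of the Git LFS pointers added in this commit. The following is a minimal sketch: the byte counts are copied from those pointers, while the `SIZES` table and the helper names are hypothetical and exist only for illustration. It says nothing about output quality, which the sizes alone cannot show.

```python
# File sizes in bytes, copied from the Git LFS pointer files in this commit.
# The dictionary keys are shorthand labels chosen for this sketch.
SIZES = {
    "f16": 12124098496,   # Yi-1.5-6B-Chat.f16.gguf
    "q5_k": 4957736896,   # Yi-1.5-6B-Chat.q5_k.gguf
    "q6_k": 5592780736,   # Yi-1.5-6B-Chat.q6_k.gguf
    "q8_0": 6933647296,   # Yi-1.5-6B-Chat.q8_0.gguf
}


def relative_size(name: str, baseline: str) -> float:
    """Return the size of `name` as a fraction of the size of `baseline`."""
    return SIZES[name] / SIZES[baseline]


def is_smaller(name: str, baseline: str) -> bool:
    """Return True if `name` occupies fewer bytes than `baseline`."""
    return SIZES[name] < SIZES[baseline]


if __name__ == "__main__":
    for name in ("q5_k", "q6_k"):
        ratio = relative_size(name, "q8_0")
        print(f"{name}: {SIZES[name] / 2**30:.2f} GiB, "
              f"{ratio:.1%} of q8_0, smaller: {is_smaller(name, 'q8_0')}")
```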

Yi-1.5-6B-Chat.f16.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:ed2fa5b9cf16c2637040ef2df1552619178ae5045cd85816be11884ed818a966
+size 12124098496

Yi-1.5-6B-Chat.q5_k.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:3a715e164bdbb65a5e97ac79943db9a7dfc11ec2e2741e619e275dacf5dda590
+size 4957736896

Yi-1.5-6B-Chat.q6_k.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:8c9ab10f9f16f586b8abc33d438f25a5f54d14fc01cf2c1d4fc7e1f4723d281a
+size 5592780736

Yi-1.5-6B-Chat.q8_0.gguf ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:a74c44f40ee794eb55b649a25bb1f8d81eb22815830cfde4a7bba74533585235
+size 6933647296
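
Each `.gguf` entry above is a Git LFS pointer: three lines giving the spec version, a SHA-256 object ID, and the size in bytes. A downloaded file can be checked against its pointer by comparing both values. The sketch below is one way to do that with the Python standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the parser assumes the pointer text may still carry the leading `+` from the diff.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse Git LFS pointer text into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.splitlines():
        # Tolerate the leading '+' that a diff view puts in front of added lines.
        line = line.lstrip("+").strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and hash recorded in `pointer`."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a multi-gigabyte file.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so large model files are never loaded fully into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, passing the three pointer lines for `Yi-1.5-6B-Chat.q8_0.gguf` to `parse_lfs_pointer` and then calling `matches_pointer` on a local copy of that file would report whether the download is complete and unmodified.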