ProphetOfBostrom
committed on
Commit 84e3837
Parent(s): 624287d
please work lfs
Browse files:
- Kyllene-57B-v1.0.q6_K.gguf.imatrix +3 -0
- README.md +20 -0
- techmulcodetiny.utf8 +0 -0
Kyllene-57B-v1.0.q6_K.gguf.imatrix
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ff162e35edc443601a4a4bf9a2ee79a970957581eb9b467f907711849de615f9
+size 25418434
README.md
ADDED
@@ -0,0 +1,20 @@
+---
+license: other
+license_name: yi-license
+license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
+tags:
+- merge, GGUF, imatrix
+---
+## [Kyllene-57B](/TeeZee/Kyllene-57B-v1.0) quantized to 2~3 bpw GGUF
+### NOTICE: I did not use the original file! I started with Q6_K (there was no Q8)
+#### There may well be problems with these quants, but I'll eat my own ass if a 57B Q6_K (>6.5 bpw) is the root of any of them. More suspect is how I produced the imatrix.
+imatrix included, generated from a 900 kB text file, which is also included.
+This file was made by concatenating most of the default exllamav2 calibration data: a 900 kB file of coherent text only, with some formatting and code but no endless broken HTML tags or nonsense. It includes multilingual text, for those deep layers.
+
+
+[IQ2_XS](./Kyllene-57B-v1.0.IQ2_XS.gguf/) 2.38 BPW `CUDA0 buffer size = 15941.43 MiB`
+- This file only exists because I did the maths wrong (I was expecting it to be bigger), but I recall that 16 GB GPUs exist, and I may give it a go with stable diffusion.
+
+IQ2_M briefly existed before I clobbered (technical term) it. It might be back.
+
+[IQ3_XXS](./Kyllene-57B-v1.0.IQ3_XXS.gguf/) (<3.1 BPW)
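For reference, an imatrix like the one committed here is typically produced with llama.cpp's `imatrix` tool and then passed to `quantize`. A minimal sketch of that flow, assuming the bundled `techmulcodetiny.utf8` served as the calibration text; filenames and paths are illustrative, not the author's exact invocation:

```shell
# Sketch of the usual llama.cpp imatrix + requantization flow (assumed, not
# the recorded commands). Run from a built llama.cpp checkout.

# 1. Compute the importance matrix from calibration text against the
#    Q6_K source quant (per the notice above, no f16/Q8 was available):
./imatrix -m Kyllene-57B-v1.0.q6_K.gguf \
          -f techmulcodetiny.utf8 \
          -o Kyllene-57B-v1.0.q6_K.gguf.imatrix

# 2. Requantize to a low-bpw IQ type, guided by that matrix:
./quantize --imatrix Kyllene-57B-v1.0.q6_K.gguf.imatrix \
           Kyllene-57B-v1.0.q6_K.gguf \
           Kyllene-57B-v1.0.IQ2_XS.gguf IQ2_XS
```

Quantizing from a Q6_K source rather than the original weights is unusual, which is exactly the caveat the NOTICE flags.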
techmulcodetiny.utf8
ADDED
The diff for this file is too large to render.