ProphetOfBostrom
committed on
Commit 84e3837
Parent(s): 624287d
please work lfs
Browse files:
- Kyllene-57B-v1.0.q6_K.gguf.imatrix +3 -0
- README.md +20 -0
- techmulcodetiny.utf8 +0 -0
Kyllene-57B-v1.0.q6_K.gguf.imatrix
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ff162e35edc443601a4a4bf9a2ee79a970957581eb9b467f907711849de615f9
+size 25418434
README.md
ADDED
@@ -0,0 +1,20 @@
+---
+license: other
+license_name: yi-license
+license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
+tags:
+- merge, GGUF, imatrix
+---
+## [Kyllene-57B](/TeeZee/Kyllene-57B-v1.0) quantized to 2~3 bpw GGUF
+### NOTICE: I did not use the original file! I started with Q6_K (there was no Q8)
+#### There may well be problems with these quants, but I'll eat my own ass if a 57B Q6_K (>6.5 bpw) is the root of any of them. More suspect is how I produced the imatrix.
+imatrix included, generated from a 900 kB text file, which is also included.
+This file was made by concatenating most of the default exllamav2 calibration data: a 900 kB file of coherent text only, with some formatting and code but no endless broken HTML tags or nonsense. It includes multilingual text, for those deep layers.
+
+
+[IQ2_XS](./Kyllene-57B-v1.0.IQ2_XS.gguf/) 2.38 BPW `CUDA0 buffer size = 15941.43 MiB`
+- This file only exists because I did the maths wrong (I was expecting it to be bigger), but I recall that 16 GB GPUs exist, and I may give it a go with stable diffusion.
+
+IQ2_M briefly existed before I clobbered (technical term) it. It might be back.
+
+[IQ3_XXS](./Kyllene-57B-v1.0.IQ3_XXS.gguf/) (<3.1 BPW)
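For reference, an imatrix like the one committed here is typically produced with llama.cpp's `imatrix` tool and then passed to `quantize`. A minimal sketch of that flow, assuming the bundled `techmulcodetiny.utf8` served as the calibration text; filenames and paths are illustrative, not the author's exact invocation:

```shell
# Sketch of the usual llama.cpp imatrix + requantization flow (assumed, not
# the recorded commands). Run from a built llama.cpp checkout.

# 1. Compute the importance matrix from calibration text against the
#    Q6_K source quant (per the notice above, no f16/Q8 was available):
./imatrix -m Kyllene-57B-v1.0.q6_K.gguf \
          -f techmulcodetiny.utf8 \
          -o Kyllene-57B-v1.0.q6_K.gguf.imatrix

# 2. Requantize to a low-bpw IQ type, guided by that matrix:
./quantize --imatrix Kyllene-57B-v1.0.q6_K.gguf.imatrix \
           Kyllene-57B-v1.0.q6_K.gguf \
           Kyllene-57B-v1.0.IQ2_XS.gguf IQ2_XS
```

Quantizing from a Q6_K source rather than the original weights is unusual, which is exactly the caveat the NOTICE flags.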
techmulcodetiny.utf8
ADDED
The diff for this file is too large to render.