---
title: README
emoji: 🔥
colorFrom: purple
colorTo: purple
sdk: static
pinned: true
---

[These are my own quantizations (updated almost daily).](https://huggingface.co./spaces/RobertSinclair)  

Unlike standard quantizations, these keep the output and embedding tensors at f16  
and quantize the other tensors to q5_k, q6_k or q8_0.  
This produces models with little or no degradation at a smaller size.  
They run at about 3-6 tokens/sec on CPU only using llama.cpp,  
and obviously faster on computers with powerful GPUs.  
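
As a rough illustration of the size trade-off, the sketch below estimates file size when some tensors stay at f16 and the rest use a lower-bit type. The bits-per-weight figures are the nominal block sizes of the llama.cpp formats. The parameter counts in the usage lines are hypothetical, and the estimate ignores metadata and the fact that some quantization presets mix tensor types.

```python
# Nominal bits per weight (block bytes * 8 / weights per block).
BITS_PER_WEIGHT = {
    "f16": 16.0,
    "q8_0": 8.5,     # 34 bytes per 32 weights
    "q6_k": 6.5625,  # 210 bytes per 256 weights
    "q5_k": 5.5,     # 176 bytes per 256 weights
}


def estimate_size_gb(total_params, f16_params, other_type):
    """Estimate size in GB when f16_params stay at f16 and the rest use other_type."""
    other_params = total_params - f16_params
    bits = (f16_params * BITS_PER_WEIGHT["f16"]
            + other_params * BITS_PER_WEIGHT[other_type])
    return bits / 8 / 1e9


# Hypothetical model: 8e9 parameters, 1e9 of them in output + embedding tensors.
for qtype in ("q5_k", "q6_k", "q8_0", "f16"):
    print(qtype, round(estimate_size_gb(8e9, 1e9, qtype), 2), "GB")
```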

ALL the models were quantized in this way:  
```
python llama.cpp/convert_hf_to_gguf.py --outtype f16 model --outfile model.f16.gguf

quantize.exe --allow-requantize --output-tensor-type f16 --token-embedding-type f16 model.f16.gguf model.f16.q5.gguf q5_k  
quantize.exe --allow-requantize --output-tensor-type f16 --token-embedding-type f16 model.f16.gguf model.f16.q6.gguf q6_k  
quantize.exe --allow-requantize --output-tensor-type f16 --token-embedding-type f16 model.f16.gguf model.f16.q8.gguf q8_0
quantize.exe --allow-requantize --pure model.f16.gguf model.f16.q8_p.gguf q8_0   
```
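
To apply the same recipe to many models, the commands can be generated from a model name. This is a minimal sketch, not the script actually used here: the helper name `build_quantize_commands` is hypothetical, and it assumes each quantization type gets its own output file (the `q8` suffix is an assumption) so that no file is overwritten. Only the flags shown in the commands above are used.

```python
def build_quantize_commands(stem, exe="quantize.exe"):
    """Return the quantize command lines (as argument lists) for one model stem."""
    src = f"{stem}.f16.gguf"
    # Keep output and token-embedding tensors at f16; quantize the rest.
    mixed = ["--allow-requantize",
             "--output-tensor-type", "f16",
             "--token-embedding-type", "f16"]
    cmds = []
    for suffix, qtype in [("q5", "q5_k"), ("q6", "q6_k"), ("q8", "q8_0")]:
        cmds.append([exe, *mixed, src, f"{stem}.f16.{suffix}.gguf", qtype])
    # Pure q8_0: every tensor uses the same type.
    cmds.append([exe, "--allow-requantize", "--pure",
                 src, f"{stem}.f16.q8_p.gguf", "q8_0"])
    return cmds


for cmd in build_quantize_commands("model"):
    print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` once the f16 GGUF file exists.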

* [ZeroWw/Mistral-Nemo-Instruct-2407-GGUF](https://huggingface.co./ZeroWw/Mistral-Nemo-Instruct-2407-GGUF)
* [ZeroWw/L3-8B-Celeste-V1.2-GGUF](https://huggingface.co./ZeroWw/L3-8B-Celeste-V1.2-GGUF)
* [ZeroWw/xLAM-7b-fc-r-GGUF](https://huggingface.co./ZeroWw/xLAM-7b-fc-r-GGUF)
* [ZeroWw/xLAM-1b-fc-r-GGUF](https://huggingface.co./ZeroWw/xLAM-1b-fc-r-GGUF)
* [ZeroWw/Mistral-7B-Instruct-v0.3-GGUF](https://huggingface.co./ZeroWw/Mistral-7B-Instruct-v0.3-GGUF)
* [ZeroWw/L3-8b-Rosier-v1-GGUF](https://huggingface.co./ZeroWw/L3-8b-Rosier-v1-GGUF)
* [ZeroWw/llama3-turbcat-instruct-8b-GGUF](https://huggingface.co./ZeroWw/llama3-turbcat-instruct-8b-GGUF)
* [ZeroWw/L3-SthenoMaid-8B-V1-GGUF](https://huggingface.co./ZeroWw/L3-SthenoMaid-8B-V1-GGUF)
* [ZeroWw/L3-8B-Celeste-v1-GGUF](https://huggingface.co./ZeroWw/L3-8B-Celeste-v1-GGUF)
* [ZeroWw/Gemmasutra-9B-v1b-GGUF](https://huggingface.co./ZeroWw/Gemmasutra-9B-v1b-GGUF)
* [ZeroWw/ghost-7b-alpha-GGUF](https://huggingface.co./ZeroWw/ghost-7b-alpha-GGUF)
* [ZeroWw/palmer-004-turbo-GGUF](https://huggingface.co./ZeroWw/palmer-004-turbo-GGUF)
* [ZeroWw/Hermes-2-Pro-Llama-3-8B-GGUF](https://huggingface.co./ZeroWw/Hermes-2-Pro-Llama-3-8B-GGUF)
* [ZeroWw/h2o-danube3-4b-chat-GGUF](https://huggingface.co./ZeroWw/h2o-danube3-4b-chat-GGUF)
* [ZeroWw/h2o-danube3-500m-chat-GGUF](https://huggingface.co./ZeroWw/h2o-danube3-500m-chat-GGUF)
* [ZeroWw/Smegmma-9B-v1-GGUF](https://huggingface.co./ZeroWw/Smegmma-9B-v1-GGUF)
* [ZeroWw/Mixtral_AI_Cyber_4.0-GGUF](https://huggingface.co./ZeroWw/Mixtral_AI_Cyber_4.0-GGUF)
* [ZeroWw/Llama-3-8B-Lexi-Uncensored-GGUF](https://huggingface.co./ZeroWw/Llama-3-8B-Lexi-Uncensored-GGUF)
* [ZeroWw/Tiger-Gemma-9B-v1-GGUF](https://huggingface.co./ZeroWw/Tiger-Gemma-9B-v1-GGUF)
* [ZeroWw/gpt2-xl-GGUF](https://huggingface.co./ZeroWw/gpt2-xl-GGUF)
* [ZeroWw/Arcee-Spark-GGUF](https://huggingface.co./ZeroWw/Arcee-Spark-GGUF)
* [ZeroWw/phillama-3.8b-v0.1-GGUF](https://huggingface.co./ZeroWw/phillama-3.8b-v0.1-GGUF)
* [ZeroWw/codegeex4-all-9b-GGUF](https://huggingface.co./ZeroWw/codegeex4-all-9b-GGUF)
* [ZeroWw/DeepSeek-V2-Lite-Chat-GGUF](https://huggingface.co./ZeroWw/DeepSeek-V2-Lite-Chat-GGUF)
* [ZeroWw/NuminaMath-7B-TIR-GGUF](https://huggingface.co./ZeroWw/NuminaMath-7B-TIR-GGUF)
* [ZeroWw/Phi-3-mini-128k-instruct-abliterated-v3-GGUF](https://huggingface.co./ZeroWw/Phi-3-mini-128k-instruct-abliterated-v3-GGUF)
* [ZeroWw/Phi-3-song-lyrics-1.0-GGUF](https://huggingface.co./ZeroWw/Phi-3-song-lyrics-1.0-GGUF)
* [ZeroWw/Meta-Llama-3-8B-Instruct-GGUF](https://huggingface.co./ZeroWw/Meta-Llama-3-8B-Instruct-GGUF)
* [ZeroWw/LLaMAX3-8B-Alpaca-GGUF](https://huggingface.co./ZeroWw/LLaMAX3-8B-Alpaca-GGUF)
* [ZeroWw/LLaMAX3-8B-GGUF](https://huggingface.co./ZeroWw/LLaMAX3-8B-GGUF)
* [ZeroWw/Moistral-11B-v3-GGUF](https://huggingface.co./ZeroWw/Moistral-11B-v3-GGUF)
* [ZeroWw/Moistral-11B-v4-GGUF](https://huggingface.co./ZeroWw/Moistral-11B-v4-GGUF)
* [ZeroWw/L3-Blackfall-Summanus-v0.1-15B-GGUF](https://huggingface.co./ZeroWw/L3-Blackfall-Summanus-v0.1-15B-GGUF)
* [ZeroWw/Smegmma-Deluxe-9B-v1-GGUF](https://huggingface.co./ZeroWw/Smegmma-Deluxe-9B-v1-GGUF)
* [ZeroWw/internlm2_5-7b-chat-GGUF](https://huggingface.co./ZeroWw/internlm2_5-7b-chat-GGUF)
* [ZeroWw/glm-4-9b-chat-GGUF](https://huggingface.co./ZeroWw/glm-4-9b-chat-GGUF)
* [ZeroWw/llama3-8B-DarkIdol-2.2-Uncensored-1048K-GGUF](https://huggingface.co./ZeroWw/llama3-8B-DarkIdol-2.2-Uncensored-1048K-GGUF)
* [ZeroWw/Gemma-2-9B-It-SPPO-Iter3-GGUF](https://huggingface.co./ZeroWw/Gemma-2-9B-It-SPPO-Iter3-GGUF)
* [ZeroWw/Phi-3-mini-4k-geminified-GGUF](https://huggingface.co./ZeroWw/Phi-3-mini-4k-geminified-GGUF)
* [ZeroWw/CodeQwen1.5-7B-Chat-GGUF](https://huggingface.co./ZeroWw/CodeQwen1.5-7B-Chat-GGUF)
* [ZeroWw/NeuralPipe-7B-slerp-GGUF](https://huggingface.co./ZeroWw/NeuralPipe-7B-slerp-GGUF)
* [ZeroWw/Llama-3-8B-Instruct-Gradient-4194k-GGUF](https://huggingface.co./ZeroWw/Llama-3-8B-Instruct-Gradient-4194k-GGUF)
* [ZeroWw/gemma-2-9b-it-GGUF](https://huggingface.co./ZeroWw/gemma-2-9b-it-GGUF)
* [ZeroWw/llama3-8B-DarkIdol-2.1-Uncensored-32K-GGUF](https://huggingface.co./ZeroWw/llama3-8B-DarkIdol-2.1-Uncensored-32K-GGUF)
* [ZeroWw/Meta-Llama-3-8B-Instruct-abliterated-v3-GGUF](https://huggingface.co./ZeroWw/Meta-Llama-3-8B-Instruct-abliterated-v3-GGUF)
* [ZeroWw/Hathor_Stable-v0.2-L3-8B-GGUF](https://huggingface.co./ZeroWw/Hathor_Stable-v0.2-L3-8B-GGUF)
* [ZeroWw/L3-Aethora-15B-V2-GGUF](https://huggingface.co./ZeroWw/L3-Aethora-15B-V2-GGUF)
* [ZeroWw/L3-8B-Stheno-v3.3-32K-GGUF](https://huggingface.co./ZeroWw/L3-8B-Stheno-v3.3-32K-GGUF)
* [ZeroWw/Llama-3-8B-Instruct-Gradient-1048k-GGUF](https://huggingface.co./ZeroWw/Llama-3-8B-Instruct-Gradient-1048k-GGUF)
* [ZeroWw/Pythia-Chat-Base-7B-GGUF](https://huggingface.co./ZeroWw/Pythia-Chat-Base-7B-GGUF)
* [ZeroWw/Yi-1.5-6B-Chat-GGUF](https://huggingface.co./ZeroWw/Yi-1.5-6B-Chat-GGUF)
* [ZeroWw/DeepSeek-Coder-V2-Lite-Base-GGUF](https://huggingface.co./ZeroWw/DeepSeek-Coder-V2-Lite-Base-GGUF)
* [ZeroWw/Yi-1.5-9B-32K-GGUF](https://huggingface.co./ZeroWw/Yi-1.5-9B-32K-GGUF)
* [ZeroWw/aya-23-8B-GGUF](https://huggingface.co./ZeroWw/aya-23-8B-GGUF)
* [ZeroWw/MixTAO-7Bx2-MoE-v8.1-GGUF](https://huggingface.co./ZeroWw/MixTAO-7Bx2-MoE-v8.1-GGUF)
* [ZeroWw/Phi-3-medium-128k-instruct-GGUF](https://huggingface.co./ZeroWw/Phi-3-medium-128k-instruct-GGUF)
* [ZeroWw/Phi-3-mini-128k-instruct-GGUF](https://huggingface.co./ZeroWw/Phi-3-mini-128k-instruct-GGUF)
* [ZeroWw/Qwen1.5-7B-Chat-GGUF](https://huggingface.co./ZeroWw/Qwen1.5-7B-Chat-GGUF)
* [ZeroWw/NeuralDaredevil-8B-abliterated-GGUF](https://huggingface.co./ZeroWw/NeuralDaredevil-8B-abliterated-GGUF)
* [ZeroWw/Mistroll-7B-v2.2-GGUF](https://huggingface.co./ZeroWw/Mistroll-7B-v2.2-GGUF)
* [ZeroWw/Samantha-Qwen-2-7B-GGUF](https://huggingface.co./ZeroWw/Samantha-Qwen-2-7B-GGUF)
* [ZeroWw/NSFW_DPO_Noromaid-7b-Mistral-7B-Instruct-v0.1-GGUF](https://huggingface.co./ZeroWw/NSFW_DPO_Noromaid-7b-Mistral-7B-Instruct-v0.1-GGUF)
* [ZeroWw/microsoft_WizardLM-2-7B-GGUF](https://huggingface.co./ZeroWw/microsoft_WizardLM-2-7B-GGUF)