Update README.md
Browse files
README.md
CHANGED
@@ -7,13 +7,14 @@ This is a quantized version of h2oai/h2ogpt-4096-llama2-13b-chat, formatted in GGUF

Before (lines 7–19):

     7    ## Available Formats
     8
     9    ### GGUF
    10  - (removed — content not recoverable from extraction)
    11  - (removed — content not recoverable from extraction)
    12  - (removed — content not recoverable from extraction)
    13
    14    ### Currently in conversion
    15
    16  - | Bits | Use case |
    17    | ---- | ---- | ----- |
    18    | q3_K_L | 3 | New k-quant method. Uses GGML_TYPE_Q5_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else GGML_TYPE_Q3_K |
    19    | q3_K_M | 3 | New k-quant method. Uses GGML_TYPE_Q4_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else GGML_TYPE_Q3_K |
|
7 |
## Available Formats
|
8 |
|
9 |
### GGUF
|
10 |
+
|
11 |
+
| Format | Bits | Use case |
|
12 |
+
| ---- | ---- | ----- |
|
13 |
+
| q8_0 | 8 | Original quant method, 8-bit. |
|
14 |
|
15 |
### Currently in conversion
|
16 |
|
17 |
+
| Format | Bits | Use case |
|
18 |
| ---- | ---- | ----- |
|
19 |
| q3_K_L | 3 | New k-quant method. Uses GGML_TYPE_Q5_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else GGML_TYPE_Q3_K |
|
20 |
| q3_K_M | 3 | New k-quant method. Uses GGML_TYPE_Q4_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else GGML_TYPE_Q3_K |
|