qwp4w3hyb committed
Commit 6c17f44 · verified · 1 Parent(s): 9ccdc0e

Update README.md

Files changed (1): README.md (+1 −1)
README.md CHANGED

@@ -21,7 +21,7 @@ tags:
 # Quant Infos
 
 - Not supported in llama.cpp master; Requires the latest version of the phi3 128k [branch](https://github.com/ggerganov/llama.cpp/pull/7225)
-- quants & imatrix are still in the oven will follow soon TM
+- just bf16 for now, quants & imatrix are still in the oven will follow soon TM
 <!-- - quants done with an importance matrix for improved quantization loss -->
 <!-- - gguf & imatrix generated from bf16 for "optimal" accuracy loss (some say this is snake oil, but it can't hurt) -->
 <!-- - Wide coverage of different gguf quant types from Q\_8\_0 down to IQ1\_S -->