
Question Answering · GGUF · biology · medical · Inference Endpoints · imatrix · conversational
cgus committed · Commit 8faf435 · verified · 1 Parent(s): 9a626bb

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -62,7 +62,8 @@ Made by: [FreedomIntelligence](https://huggingface.co/FreedomIntelligence)
  ## Quantization notes
  Made with llama.cpp-b3938 with imatrix file based on Exllamav2 callibration dataset.
  This model is meant to run with llama.cpp-compatible apps such as Text-Generation-WebUI, KoboldCpp, Jan, LM Studio and many many others.
- 17.12.2024: Readme update. It seems Q4_0_4_4, Q4_0_4_8 and Q4_0_8_8 support was removed in recent llama.cpp. I'll keep them but they might be no longer useful.
+ 17.12.2024: Readme update. It seems Q4_0_4_4, Q4_0_4_8 and Q4_0_8_8 support was removed in recent llama.cpp. I'll keep them but they might be no longer useful.
+ 03.02.2025: Added Q4_0 and IQ4_NL quants as a substitute for Q4_0_X_Y quants for ARM devices with newer llama.cpp versions.
  
  # Original model card
  # Democratizing Medical LLMs For Much More Languages
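
The quantization notes above point to GUI runtimes (Text-Generation-WebUI, KoboldCpp, Jan, LM Studio); for a programmatic route, the sketch below uses llama-cpp-python to load one of the GGUF quants. The file name, context size, and generation parameters are placeholder assumptions for illustration, not values taken from this repo.

```python
# Minimal sketch (assumptions: llama-cpp-python installed, a GGUF quant from
# this repo downloaded locally; the file name below is a placeholder).
from llama_cpp import Llama

llm = Llama(
    model_path="model-IQ4_NL.gguf",  # placeholder; use the quant you downloaded (e.g. Q4_0 / IQ4_NL on ARM)
    n_ctx=4096,                      # context window; lower it if RAM is tight
    n_threads=8,                     # CPU threads used for inference
)

# llama-cpp-python generally picks up the chat template from the GGUF metadata,
# so a chat-style call is the simplest way to query the model.
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "List common symptoms of iron-deficiency anemia."}],
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])
```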