---
license: apache-2.0
---

# gemma-2-9B-it-iq1_m

This is a quantized version of the Gemma 2 9B instruct model using the IQ1_M quantization method.

## Model Details

- **Original Model**: [Gemma2-9B-it](https://huggingface.co/google/gemma-2-9b-it)
- **Quantization Method**: IQ1_M
- **Precision**: 1-bit
- **iMatrix**: From [bartowski](https://huggingface.co/bartowski)'s [gemma-2-9b-it-GGUF repo](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/tree/main)

## Usage

You can use it directly with llama.cpp.
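A minimal sketch of a llama.cpp invocation, assuming the GGUF file in this repo is named `gemma-2-9B-it-iq1_m.gguf` (check the repo's file list; `<repo-id>` is a placeholder for this repository's Hugging Face id):

```shell
# Fetch the quantized GGUF file (filename is an assumption, verify against the repo)
huggingface-cli download <repo-id> gemma-2-9B-it-iq1_m.gguf --local-dir .

# Run a one-shot completion with llama.cpp's CLI
# -m: model path, -p: prompt, -n: number of tokens to generate
llama-cli -m gemma-2-9B-it-iq1_m.gguf -p "Why is the sky blue?" -n 256
```

Note that IQ1_M is an extremely aggressive 1-bit quantization, so expect noticeably degraded output quality compared with higher-bit quants of the same model.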