---
license: gemma
library_name: transformers
base_model: google/gemma-2-27b-it
---

## Model

- Quantized Gemma 2 27B Instruction Tuned with IQ3_M
- Fits on a single T4 (16 GB)

## Usage (llama-cli with GPU):

```
llama-cli -m ./gemma-2-27b-it-IQ3_M.gguf -ngl 42 --temp 0 --repeat-penalty 1.0 --color -p "Why is the sky blue?"
```

## Usage (llama-cli with CPU):

```
llama-cli -m ./gemma-2-27b-it-IQ3_M.gguf --temp 0 --repeat-penalty 1.0 --color -p "Why is the sky blue?"
```

## Usage (llama-cpp-python via Hugging Face Hub):

```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="chenghenry/gemma-2-27b-it-GGUF",
    filename="gemma-2-27b-it-IQ3_M.gguf",
    n_ctx=8192,
    n_batch=2048,
    n_gpu_layers=100,
    verbose=False,
    chat_format="gemma",
)

prompt = "Why is the sky blue?"
messages = [{"role": "user", "content": prompt}]
response = llm.create_chat_completion(
    messages=messages,
    repeat_penalty=1.0,
    temperature=0,
)
print(response["choices"][0]["message"]["content"])
```