---
license: gemma
library_name: transformers
base_model: google/gemma-2-27b-it
---

## Model
- Gemma 2 27B Instruction Tuned, quantized with IQ3_M
- Fits on a single T4 GPU (16 GB)
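
As a rough sanity check on the memory claim, the weight footprint can be estimated from the parameter count and the average bits per weight. The figures below are assumptions, not measurements from this repository: about 27.2 billion parameters for Gemma 2 27B and about 3.66 bits per weight for IQ3_M. The KV cache and compute buffers need additional memory on top of this, so treat the result as a lower bound.

```python
def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB (weights only)."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed figures (not taken from this model card):
N_PARAMS = 27.2e9        # approximate parameter count of Gemma 2 27B
BITS_PER_WEIGHT = 3.66   # approximate average for the IQ3_M quantization mix

weights_gib = estimate_weight_gib(N_PARAMS, BITS_PER_WEIGHT)
print(f"Estimated weights: {weights_gib:.1f} GiB")
```

Under these assumptions the weights alone stay below the T4's 16 GB, leaving some headroom for context.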

## Usage (llama-cli with GPU): 
```
llama-cli -m ./gemma-2-27b-it-IQ3_M.gguf -ngl 42 --temp 0 --repeat-penalty 1.0 --color -p "Why is the sky blue?"
```

## Usage (llama-cli with CPU): 
```
llama-cli -m ./gemma-2-27b-it-IQ3_M.gguf --temp 0 --repeat-penalty 1.0 --color -p "Why is the sky blue?"
```

## Usage (llama-cpp-python via Hugging Face Hub): 
```
from llama_cpp import Llama
llm = Llama.from_pretrained(
    repo_id="chenghenry/gemma-2-27b-it-GGUF",
    filename="gemma-2-27b-it-IQ3_M.gguf",
    n_ctx=8192,
    n_batch=2048,
    n_gpu_layers=100,
    verbose=False,
    chat_format="gemma"
)
prompt = "Why is the sky blue?"
messages = [{"role": "user", "content": prompt}]
response = llm.create_chat_completion(
    messages=messages,
    repeat_penalty=1.0,
    temperature=0)
print(response["choices"][0]["message"]["content"])
```
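
For context, `chat_format="gemma"` in the example above tells llama-cpp-python to wrap the messages in Gemma's turn markers before generation. The sketch below illustrates roughly what that prompt layout looks like, assuming the standard Gemma chat template; `build_gemma_prompt` is a hypothetical helper written for illustration and is not part of llama-cpp-python.

```python
def build_gemma_prompt(messages: list[dict]) -> str:
    """Illustrative sketch of the Gemma turn format (hypothetical helper)."""
    parts = []
    for message in messages:
        # Gemma names the assistant role "model"; other roles pass through.
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n{message['content']}<end_of_turn>\n")
    # Leave an open model turn so generation continues from here.
    parts.append("<start_of_turn>model\n")
    return "".join(parts)


example_prompt = build_gemma_prompt(
    [{"role": "user", "content": "Why is the sky blue?"}]
)
print(example_prompt)
```

This is only useful when calling the raw completion API; `create_chat_completion` applies the chat format for you.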