chenghenry committed
Commit 0fd5b3b
1 Parent(s): bd932ad

Create README.md

---
license: gemma
library_name: transformers
base_model: google/gemma-2-27b-it
---

## Model
- Gemma 2 27B Instruction-Tuned, quantized with IQ3_M
- Fits on a single T4 GPU (16 GB)

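IQ3_M averages roughly 3.66 bits per weight (an approximation; the exact rate varies per tensor, and the GGUF file also carries metadata and some higher-precision tensors). A back-of-the-envelope check of why the 27B quant fits on a T4:

```python
# Rough size estimate for an IQ3_M quantization of a 27B-parameter model.
# 3.66 bits/weight is an approximate average for IQ3_M, not an exact figure.
params = 27e9
bits_per_weight = 3.66

size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.1f} GB")  # well under a T4's 16 GB of VRAM
```

Note that the KV cache and scratch buffers need VRAM on top of the weights, so headroom shrinks as context length grows.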
## Usage (llama-cli with GPU)

```shell
llama-cli -m ./gemma-2-27b-it-IQ3_M.gguf -ngl 100 --temp 0 --repeat-penalty 1.0 --color -p "Why is the sky blue?"
```

## Usage (llama-cli with CPU)

```shell
llama-cli -m ./gemma-2-27b-it-IQ3_M.gguf --temp 0 --repeat-penalty 1.0 --color -p "Why is the sky blue?"
```
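When prompting llama-cli directly with `-p`, the raw Gemma chat template can be built by hand: Gemma wraps each turn in `<start_of_turn>`/`<end_of_turn>` markers and names the assistant role `model`. A minimal sketch (the `format_gemma_prompt` helper is illustrative, not part of llama.cpp):

```python
# Illustrative helper (not part of llama.cpp): builds the raw Gemma chat
# prompt string for use when bypassing a built-in chat template.
def format_gemma_prompt(messages):
    parts = []
    for m in messages:
        # Gemma uses the role name "model" for assistant turns.
        role = "model" if m["role"] == "assistant" else m["role"]
        parts.append(f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")  # cue the model to respond
    return "".join(parts)

prompt = format_gemma_prompt([{"role": "user", "content": "Why is the sky blue?"}])
print(prompt)
```

The llama-cpp-python example below avoids this entirely by passing `chat_format="gemma"`, which applies the same template internally.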
20
+
21
+ ## Usage (llama-cpp-python via Hugging Face Hub):
22
+ ```
23
+ from llama_cpp import Llama
24
+ llm = Llama.from_pretrained(
25
+ repo_id="chenghenry/gemma-2-27b-it-GGUF ",
26
+ filename="gemma-2-27b-it-IQ3_M.gguf",
27
+ n_ctx=8192,
28
+ n_batch=2048,
29
+ n_gpu_layers=100,
30
+ verbose=False,
31
+ chat_format="gemma"
32
+ )
33
+ prompt = "Why is the sky blue?"
34
+ messages = [{"role": "user", "content": prompt}]
35
+ response = llm.create_chat_completion(
36
+ messages=messages,
37
+ repeat_penalty=1.0,
38
+ temperature=0)
39
+ print(response["choices"][0]["message"]["content"])
40
+ ```