metadata
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- gemma
- trl
base_model: google/gemma-7b
Uploaded model
- Developed by: saucam
- License: apache-2.0
- Finetuned from model: google/gemma-7b
This is a fine-tuned version of gemma-7b, trained on the sarvamai/samvaad-hi-v1 Hindi dataset using the ChatML format.
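For readers unfamiliar with ChatML, the following is a minimal sketch of how a list of messages is laid out as text in that format. It is inferred from the sample output shown in the Inference section, not taken from the training code; the helper name `to_chatml` is hypothetical, and the tokenizer's own chat template (which also prepends `<bos>`) remains the authoritative source.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML string.

    Illustrative only: the real prompt is built by the tokenizer's chat
    template, which additionally prepends the <bos> token.
    """
    # Each turn is wrapped in <|im_start|>{role}\n ... <|im_end|>\n
    prompt = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    # An open assistant turn asks the model to produce the reply.
    if add_generation_prompt:
        prompt += "<|im_start|>assistant\n"
    return prompt


if __name__ == "__main__":
    print(to_chatml([{"role": "user", "content": "(9+1)+(5+0). इसे 3 चरणों में हल करें."}]))
```

The trailing open assistant turn corresponds to `add_generation_prompt = True` in the inference code.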
Inference
Unsloth can be used for fast inference:
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "saucam/gemma-samvaad-7b", # The fine-tuned model to load
    max_seq_length = 2048,
    dtype = None, # None lets the dtype be auto-detected
    load_in_4bit = False,
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
tokenizer = get_chat_template(
    tokenizer,
    chat_template = "chatml",
    map_eos_token = True, # Maps <|im_end|> to the tokenizer's EOS token
)
messages = [
    {"role": "user", "content": "(9+1)+(5+0). इसे 3 चरणों में हल करें."},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True, # Must add for generation
    return_tensors = "pt",
).to("cuda")
outputs = model.generate(input_ids = inputs, max_new_tokens = 512, use_cache = True)
out = tokenizer.batch_decode(outputs)
print(out)
['<bos><|im_start|>user\n(9+1)+(5+0). इसे 3 चरणों में हल करें.<|im_end|>\n<|im_start|>assistant\n(9+1)+(5+0) को 3 चरणों में हल करने के लिए, हम इसे छोटे भागों में विभाजित कर सकते हैं। पहले चरण में, हम 9 को 1 से जोड़ते हैं, जो 10 देता है। दूसरे चरण में, हम 5 को 0 से जोड़ते हैं, जो 5 देता है। तीसरे चरण में, हम 10 को 5 से जोड़ते हैं, जो 15 देता है। इसलिए, (9+1)+(5+0) का परिणाम 15 है।<|im_end|>']
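The decoded output above contains the full prompt and the special tokens. If only the assistant's reply is needed, it can be cut out with plain string handling. The sketch below assumes the ChatML markers appear exactly as in the sample output; `extract_assistant_reply` is a hypothetical helper, not part of Unsloth or Transformers.

```python
def extract_assistant_reply(decoded: str) -> str:
    """Return the text of the last assistant turn in a ChatML string.

    Assumes the turn opens with '<|im_start|>assistant\\n' and, if it was
    completed, closes with '<|im_end|>'. Returns '' if no assistant turn exists.
    """
    marker = "<|im_start|>assistant\n"
    # rpartition picks the last assistant turn in a multi-turn conversation.
    _, found, tail = decoded.rpartition(marker)
    if not found:
        return ""
    # Keep everything up to the end-of-turn token (or the whole tail if the
    # generation was cut off before <|im_end|> was produced).
    reply, _, _ = tail.partition("<|im_end|>")
    return reply.strip()


if __name__ == "__main__":
    sample = "<bos><|im_start|>user\nhi<|im_end|>\n<|im_start|>assistant\nhello<|im_end|>"
    print(extract_assistant_reply(sample))
```

With the inference code above, this would be called as `extract_assistant_reply(out[0])`.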
This Gemma model was trained 2x faster with Unsloth and Hugging Face's TRL library.