Uploaded model
- Developed by: Ellight
- License: apache-2.0
- Finetuned from model: unsloth/gemma-7b-bnb-4bit
This Gemma model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Hindi-Gemma-2B-instruct (Instruction-tuned)
Hindi-Gemma-2B-instruct is an instruction-tuned Hindi large language model (LLM) with 2 billion parameters, based on Gemma 2B.
To run inference using the LoRA adapters:
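from unsloth import FastLanguageModel

# Assumed values: the card does not define these, so adjust them to your training setup.
max_seq_length = 2048  # assumed maximum sequence length
dtype = None           # None lets Unsloth auto-detect the dtype
load_in_4bit = True    # assumed, since the checkpoint name indicates bnb 4-bit weights

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Ellight/gemma-2b-bnb-4bit",  # the fine-tuned model from this card
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model)  # Enable native 2x faster inference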
prompt = """
Instruction:
{}
Response:
{}"""
inputs = tokenizer(
    [prompt.format(
        "शतरंज बोर्ड पर कितने वर्ग होते हैं?",  # instruction: "How many squares are there on a chessboard?"
        "",  # output - leave this blank for generation!
    )],
    return_tensors = "pt",
).to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
print(tokenizer.batch_decode(outputs))
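The decoded string contains the full prompt followed by the generated text, so in practice you may want only the part after the `Response:` marker. The following is a minimal sketch of one way to do that, not part of the original card: `extract_response`, `RESPONSE_MARKER`, and the `SPECIAL_TOKENS` tuple are hypothetical names, and the special-token strings are an assumption about what the tokenizer emits.

```python
# Hypothetical helper: pull the generated answer out of a decoded string
# that follows the Instruction/Response prompt template used above.

RESPONSE_MARKER = "Response:"
# Assumed special-token strings; adjust to whatever your tokenizer emits.
SPECIAL_TOKENS = ("<bos>", "<eos>", "<pad>")


def extract_response(decoded: str) -> str:
    """Return the text after the first 'Response:' marker, without special tokens."""
    head, marker, tail = decoded.partition(RESPONSE_MARKER)
    # If the marker is missing, fall back to the whole string.
    text = tail if marker else head
    for token in SPECIAL_TOKENS:
        text = text.replace(token, "")
    return text.strip()


if __name__ == "__main__":
    # Hand-written string for illustration only, not real model output.
    sample = "<bos>\nInstruction:\nQUESTION\nResponse:\nANSWER<eos>"
    print(extract_response(sample))
```

With the inference code above, this could be applied as `[extract_response(text) for text in tokenizer.batch_decode(outputs)]`. Passing `skip_special_tokens=True` to `batch_decode` is an alternative way to drop the special tokens before splitting on the marker.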