Gemma 2: 2B QazPerry

Overview

Gemma 2: 2B QazPerry is a fine-tuned version of the Gemma 2 2B model, optimized for the Kazakh language. It is part of the QazPerry initiative, which aims to develop Small Large Language Models (SLLMs) to improve Kazakh NLP capabilities.

Model Details

Base Model: Gemma 2 (2B)
Training Dataset: saillab/alpaca_kazakh_taco
Training Method: LoRA fine-tuning (rank 64) with the AdamW optimizer
License: MIT
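
To give a sense of what rank 64 means for LoRA, the sketch below counts the trainable parameters a rank-r adapter adds to one dense weight matrix and compares that with updating the full matrix. The 2048 x 2048 layer shape is an illustrative assumption, not a dimension taken from this model.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in a LoRA adapter: A is (d_in x rank), B is (rank x d_out)."""
    return rank * (d_in + d_out)


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in the full dense weight matrix."""
    return d_in * d_out


# Hypothetical layer shape chosen only for illustration.
D_IN, D_OUT, RANK = 2048, 2048, 64

adapter = lora_param_count(D_IN, D_OUT, RANK)
full = full_param_count(D_IN, D_OUT)
print(f"LoRA adapter params: {adapter:,}")   # 262,144
print(f"Full matrix params:  {full:,}")      # 4,194,304
print(f"Trainable fraction:  {adapter / full:.2%}")  # 6.25%
```

For this assumed shape, the adapter trains about one sixteenth of the weights that full fine-tuning of the same matrix would touch.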

Training Process

The model was fine-tuned on 30,000 Kazakh instruction-response pairs to improve instruction following. The training setup was:

Token limit: 3,000 per response
Fine-tuning method: LoRA adaptation for efficient fine-tuning
Optimizer: AdamW with a learning rate of 5e-5
Batch size: 2
Training epochs: 1
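
The settings above could be wired together roughly as follows. This is a minimal sketch, not the exact training script: the preset name "gemma2_2b_en", the training prompt template (copied from the inference prompt in the Usage section), and the mapping of the token limit onto the preprocessor's sequence length are all assumptions. The names TrainConfig, format_example, and fine_tune are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values stated in this model card.
    num_examples: int = 30_000
    lora_rank: int = 64
    learning_rate: float = 5e-5
    batch_size: int = 2
    epochs: int = 1
    token_limit: int = 3_000

    @property
    def optimizer_steps(self) -> int:
        # Ceiling division: a final partial batch still counts as one step.
        steps_per_epoch = -(-self.num_examples // self.batch_size)
        return steps_per_epoch * self.epochs


def format_example(instruction: str, response: str) -> str:
    # Assumed template, mirroring the inference prompt in the Usage section.
    return f"Instruction:\n{instruction}\n\nResponse:\n{response}"


def fine_tune(texts: list[str], cfg: TrainConfig):
    # Imported lazily so the config can be inspected without Keras installed.
    import keras
    import keras_nlp

    # "gemma2_2b_en" is an assumed preset name for the base model.
    gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_2b_en")
    # Freeze the backbone and add trainable low-rank adapters.
    gemma_lm.backbone.enable_lora(rank=cfg.lora_rank)
    # Assumption: the token limit is applied as the sequence length.
    gemma_lm.preprocessor.sequence_length = cfg.token_limit
    gemma_lm.compile(
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=keras.optimizers.AdamW(learning_rate=cfg.learning_rate),
        weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
    )
    gemma_lm.fit(texts, epochs=cfg.epochs, batch_size=cfg.batch_size)
    return gemma_lm


cfg = TrainConfig()
print(cfg.optimizer_steps)  # 15000
```

With 30,000 examples, a batch size of 2, and one epoch, the run amounts to 15,000 optimizer steps.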

Usage

To use the model, install the dependencies, then download the model file and load it with Keras:

pip install keras-nlp huggingface_hub

from huggingface_hub import hf_hub_download
import keras
import keras_nlp

repo_id = "silvermete0r/Gemma2_2B_QazPerry"
filename = "Gemma2_2B_QazPerry.keras"

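# Download the saved model file from the Hugging Face Hub (cached locally after the first call).
model_path = hf_hub_download(repo_id=repo_id, filename=filename)

# Load the model, registering GemmaCausalLM so Keras can deserialize it.
gemma_lm = keras.models.load_model(model_path, custom_objects={"GemmaCausalLM": keras_nlp.models.GemmaCausalLM})

# The prompt asks, in Kazakh, "Say something in Kazakh?"
prompt = "Instruction:\nҚазақша бірдеңе айтшы?\n\nResponse:\n"

print(gemma_lm.generate(prompt))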

Inference Example: https://www.kaggle.com/code/armanzhalgasbayev/gemma2-2b-qazperry-first-inference
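
Because the usage snippet relies on a fixed prompt template, it can help to wrap that template in small helpers. The sketch below builds the prompt and removes it from the generated text, on the assumption that generate returns the prompt followed by the completion. The helper names build_prompt and extract_response are hypothetical, and the reply string is a placeholder standing in for real model output.

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the template used in the usage snippet."""
    return f"Instruction:\n{instruction}\n\nResponse:\n"


def extract_response(generated: str, prompt: str) -> str:
    """Return only the completion if the output starts with the prompt."""
    if generated.startswith(prompt):
        generated = generated[len(prompt):]
    return generated.strip()


prompt = build_prompt("Қазақша бірдеңе айтшы?")

# Stand-in for gemma_lm.generate(prompt); the reply text below is a
# placeholder, not real model output.
fake_output = prompt + "Сәлем!\n"

print(extract_response(fake_output, prompt))  # Сәлем!
```

With the loaded model, the same helpers would be used as extract_response(gemma_lm.generate(prompt), prompt).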

Performance

The fine-tuned model shows improved Kazakh language understanding; further refinements are planned to improve coherence and factual correctness.

Future Plans

Optimize inference speed for deployment in real-world applications

License

This model is released under the MIT License. Feel free to use, modify, and distribute it under the terms of the license.

Acknowledgements

Special thanks to Hugging Face, KerasNLP, and the Kazakh NLP community for their support in developing this model.

For more details, see the Hugging Face model page for silvermete0r/Gemma2_2B_QazPerry.
