Gemma 2: 2B QazPerry

Overview

Gemma 2: 2B QazPerry is a fine-tuned version of the Gemma 2 2B model (google/gemma-2-2b), optimized for the Kazakh language. It is part of the QazPerry initiative, which develops Small Large Language Models (SLLMs) to strengthen Kazakh NLP capabilities.

Model Details

  • Base Model: Gemma 2 2B (google/gemma-2-2b)
  • Training Dataset: saillab/alpaca_kazakh_taco
  • Training Method: Fine-tuned with LoRA (Rank=64) and optimized using AdamW
  • License: MIT
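
To give a rough sense of why LoRA keeps fine-tuning cheap, the sketch below counts the trainable parameters a rank-64 adapter adds to a single weight matrix. The matrix width of 2304 is an illustrative assumption, not a figure taken from this card, and the helper name `lora_param_count` is hypothetical.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_in x d_out weight matrix.

    LoRA freezes the original matrix and learns a low-rank update made of
    two factors with shapes (d_in, rank) and (rank, d_out).
    """
    return rank * (d_in + d_out)


# Assumed, illustrative square projection; rank 64 as stated in the card.
d = 2304
full = d * d
lora = lora_param_count(d, d, rank=64)
print(f"full matrix: {full:,}  LoRA factors: {lora:,}  ratio: {lora / full:.3f}")
```

Under these assumptions the adapter trains only a small fraction of the weights in each adapted matrix, which is what makes fine-tuning a 2B-parameter model practical on modest hardware.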

Training Process

The model was fine-tuned on a dataset of 30,000 Kazakh instruction-response pairs to improve instruction following in Kazakh. The training configuration was:

  • Token limit of 3,000 per response
  • LoRA adaptation for efficient fine-tuning
  • AdamW optimizer with learning rate of 5e-5
  • Batch size: 2
  • Training Epochs: 1
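
The settings above can be turned into a quick sanity check. The sketch below applies a response-length limit and counts optimizer steps. It is an illustration under stated assumptions: the card does not say whether over-length responses were dropped or truncated (this sketch drops them), the whitespace token counter is a stand-in for the model's tokenizer, and the function names are hypothetical.

```python
import math
from typing import Callable, Iterable, List, Tuple


def filter_by_response_length(
    pairs: Iterable[Tuple[str, str]],
    max_tokens: int = 3000,
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> List[Tuple[str, str]]:
    """Keep (instruction, response) pairs whose response fits the limit.

    count_tokens is a placeholder (whitespace split); a real run would
    count tokens with the model's own tokenizer.
    """
    return [(q, a) for q, a in pairs if count_tokens(a) <= max_tokens]


def optimizer_steps(num_examples: int, batch_size: int = 2, epochs: int = 1) -> int:
    """Number of optimizer updates, assuming no gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


# Two toy pairs: the second response is far over the 3,000-token limit.
pairs = [("Сәлем де.", "Сәлем!"), ("Ұзақ жауап бер.", "сөз " * 4000)]
kept = filter_by_response_length(pairs)
print(len(kept))                # pairs that survive the length filter
print(optimizer_steps(30_000))  # updates for 30,000 pairs, batch size 2, 1 epoch
```

With a batch size of 2 and a single epoch, each of the 30,000 pairs is seen once, so the run amounts to 15,000 updates if no gradient accumulation is used.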

Usage

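To use the model, first install the dependencies from a shell:

pip install keras-nlp huggingface_hub

Then download the saved model from the Hugging Face Hub and generate a response in Python:

from huggingface_hub import hf_hub_download
import keras
import keras_nlp

repo_id = "silvermete0r/Gemma2_2B_QazPerry"
filename = "Gemma2_2B_QazPerry.keras"

# Download the .keras file (cached locally after the first call).
model_path = hf_hub_download(repo_id=repo_id, filename=filename)

# Register GemmaCausalLM so Keras can deserialize the saved model.
gemma_lm = keras.models.load_model(
    model_path,
    custom_objects={"GemmaCausalLM": keras_nlp.models.GemmaCausalLM},
)

# Instruction/Response prompt; the Kazakh text means "Say something in Kazakh?"
prompt = "Instruction:\nҚазақша бірдеңе айтшы?\n\nResponse:\n"

print(gemma_lm.generate(prompt))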

Inference Example: https://www.kaggle.com/code/armanzhalgasbayev/gemma2-2b-qazperry-first-inference
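
When calling the model repeatedly, it can help to wrap the prompt template from the usage snippet in small helpers. The sketch below is one possible way to do that. The names `build_prompt` and `extract_response` are hypothetical, and it assumes the generated text begins with the prompt itself, which is worth verifying against your installed KerasNLP version. No model is loaded here; a stand-in string plays the role of the model output.

```python
PROMPT_TEMPLATE = "Instruction:\n{instruction}\n\nResponse:\n"


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Instruction/Response template."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def extract_response(prompt: str, generated: str) -> str:
    """Return only the continuation, assuming the output echoes the prompt."""
    if generated.startswith(prompt):
        generated = generated[len(prompt):]
    return generated.strip()


prompt = build_prompt("Қазақша бірдеңе айтшы?")
# Stand-in for gemma_lm.generate(prompt); replace with the real call.
fake_output = prompt + "Сәлем! Қалайсыз?"
print(extract_response(prompt, fake_output))
```

With the real model, `fake_output` would be replaced by the string returned from `gemma_lm.generate(prompt)`.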

Performance

The fine-tuned model demonstrates improved Kazakh language understanding, but further refinements are planned to enhance coherence and factual correctness.

Future Plans

  • Optimize inference speed for deployment in real-world applications

License

This model is released under the MIT License. Feel free to use, modify, and distribute it under the terms of the license.

Acknowledgements

Special thanks to Hugging Face, KerasNLP, and the Kazakh NLP community for their support in developing this model.

For more details, see the Hugging Face model page (silvermete0r/Gemma2_2B_QazPerry).

Model tree: silvermete0r/Gemma2_2B_QazPerry is fine-tuned from the base model google/gemma-2-2b.
