# Gemma 2: 2B QazPerry
## Overview

Gemma 2: 2B QazPerry is a fine-tuned version of the Gemma 2B model, optimized for the Kazakh language. It is part of the QazPerry initiative, which develops Small Large Language Models (SLLMs) to strengthen Kazakh NLP capabilities.
## Model Details
- Base Model: Gemma 2B
- Training Dataset: saillab/alpaca_kazakh_taco
- Training Method: Fine-tuned with LoRA (Rank=64) and optimized using AdamW
- License: MIT
## Training Process
The model was fine-tuned on 30,000 Kazakh instruction-response pairs to improve instruction-following performance. The training setup (sketched in code after this list) included:
- Token limit of 3,000 per response
- LoRA adaptation for efficient fine-tuning
- AdamW optimizer with learning rate of 5e-5
- Batch size: 2
- Training Epochs: 1
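
The card does not include training code, so the following is a minimal sketch of the setup above using the standard KerasNLP LoRA workflow. The preset name and the `train_data` variable are assumptions; only the hyperparameters (LoRA rank 64, AdamW at 5e-5, batch size 2, one epoch, 3,000-token limit) come from this card.

```python
import keras
import keras_nlp

# Load the base model (preset name assumed; adjust to the exact Gemma 2B build used).
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_2b_en")

# LoRA adaptation for efficient fine-tuning (Rank=64, as listed above).
gemma_lm.backbone.enable_lora(rank=64)

# Cap sequences at 3,000 tokens, matching the per-response limit above.
gemma_lm.preprocessor.sequence_length = 3000

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.AdamW(learning_rate=5e-5),
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)

# `train_data` is hypothetical: instruction-response pairs from
# saillab/alpaca_kazakh_taco, formatted as "Instruction:\n...\n\nResponse:\n..." strings.
# gemma_lm.fit(train_data, epochs=1, batch_size=2)
```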
## Usage
First install the dependencies:

```bash
pip install keras-nlp huggingface_hub
```

Then download the model from the Hugging Face Hub and run inference:

```python
from huggingface_hub import hf_hub_download
import keras
import keras_nlp

# Download the fine-tuned model file from the Hub.
repo_id = "silvermete0r/Gemma2_2B_QazPerry"
filename = "Gemma2_2B_QazPerry.keras"
model_path = hf_hub_download(repo_id=repo_id, filename=filename)

# Load the saved model; GemmaCausalLM must be passed as a custom object.
gemma_lm = keras.models.load_model(
    model_path,
    custom_objects={"GemmaCausalLM": keras_nlp.models.GemmaCausalLM},
)

# Prompts follow the "Instruction:/Response:" template used in training.
# The instruction below is Kazakh for "Say something in Kazakh?".
prompt = "Instruction:\nҚазақша бірдеңе айтшы?\n\nResponse:\n"
print(gemma_lm.generate(prompt))
```
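
Since the model was trained on a fixed instruction template, other prompts should follow the same layout. A small helper (hypothetical, not part of the original card) makes this explicit:

```python
def build_prompt(instruction: str) -> str:
    # Wrap a Kazakh instruction in the template used during fine-tuning.
    return f"Instruction:\n{instruction}\n\nResponse:\n"

# Kazakh for "Which city is the capital of Kazakhstan?"
print(gemma_lm.generate(build_prompt("Қазақстанның астанасы қай қала?")))
```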
Inference example: [Kaggle notebook](https://www.kaggle.com/code/armanzhalgasbayev/gemma2-2b-qazperry-first-inference)
## Performance
The fine-tuned model demonstrates improved Kazakh language understanding over the base model, but further refinements are planned to enhance coherence and factual accuracy.
## Future Plans
- Optimize inference speed for deployment in real-world applications
## License
This model is released under the MIT License. Feel free to use, modify, and distribute it under the terms of the license.
## Acknowledgements
Special thanks to Hugging Face, KerasNLP, and the Kazakh NLP community for their support in developing this model.
For more details, check out the Hugging Face Model Page.