Gemma2-2B-Swahili-IT

Gemma2-2B-Swahili-IT is a lightweight, efficient open variant of Google's Gemma2-2B-IT model, fine-tuned for natural Swahili language understanding and generation. This model provides a resource-efficient option for Swahili language tasks while maintaining strong performance.

Model Details

  • Developer: Alfaxad Eyembe
  • Base Model: google/gemma-2-2b-it
  • Model Type: Decoder-only transformer
  • Language(s): Swahili
  • License: Apache 2.0
  • Finetuning Approach: Low-Rank Adaptation (LoRA)

Training Data

The model was fine-tuned on a comprehensive dataset containing:

  • 67,017 instruction-response pairs
  • 16,273,709 total tokens
  • Average 242.83 tokens per example
  • High-quality, naturally-written Swahili content

Performance

Massive Multitask Language Understanding (MMLU) - Swahili

  • Base Model: 31.58% accuracy
  • Fine-tuned Model: 38.60% accuracy
  • Improvement: +7.02 percentage points

Sentiment Analysis

  • Base Model: 84.85% accuracy
  • Fine-tuned Model: 86.00% accuracy
  • Improvement: +1.15 percentage points
  • Response Validity: 100%

Intended Use

This model is designed for:

  • Basic Swahili text generation
  • Question answering
  • Sentiment analysis
  • Simple creative writing
  • General instruction following in Swahili
  • Resource-constrained environments

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("alfaxadeyembe/gemma2-2b-swahili-it")
model = AutoModelForCausalLM.from_pretrained(
    "alfaxadeyembe/gemma2-2b-swahili-it",
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Always set to eval mode for inference
model.eval()

# Example usage
prompt = "Eleza dhana ya uchumi wa kidijitali na umuhimu wake katika ulimwengu wa leo."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=500,
        do_sample=True,
        temperature=0.7,
        top_p=0.95
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
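
The example above feeds a raw prompt string. Gemma-2 instruction-tuned checkpoints also ship with a chat template in their tokenizer; assuming this fine-tune keeps the base model's template, the same request can be phrased as a chat turn. This is a sketch of an alternative calling pattern, not the only supported usage:

# Alternative: wrap the prompt with the tokenizer's chat template
# (assumes the fine-tuned tokenizer retains the Gemma-2 template)
messages = [{"role": "user", "content": prompt}]
chat_inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

with torch.no_grad():
    chat_outputs = model.generate(
        chat_inputs,
        max_new_tokens=500,
        do_sample=True,
        temperature=0.7,
        top_p=0.95
    )

print(tokenizer.decode(chat_outputs[0], skip_special_tokens=True))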

Training Details

  • Fine-tuning Method: LoRA
  • Training Steps: 400
  • Batch Size: 2
  • Gradient Accumulation Steps: 32
  • Learning Rate: 2e-4
  • Training Time: ~8 hours on an A100 GPU
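
For orientation, the snippet below is a minimal sketch of how a comparable LoRA run could be configured with the peft library, plugging in the hyperparameters listed above. The LoRA rank, alpha, dropout, and target modules are not stated in this card and are assumptions.

import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# Load the base instruction-tuned model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# LoRA adapter configuration (rank, alpha, dropout, and target modules are assumptions)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM"
)
model = get_peft_model(base_model, lora_config)

# Hyperparameters taken from the list above
training_args = TrainingArguments(
    output_dir="gemma2-2b-swahili-it-lora",
    max_steps=400,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=32,
    learning_rate=2e-4,
    bf16=True
)

The adapter would then be trained on the instruction-response pairs with a standard Trainer (or SFTTrainer) loop and merged into the base weights for release.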

Key Features

  • Lightweight and efficient (2B parameters)
  • Suitable for resource-constrained environments
  • Good performance on basic language tasks
  • Fast inference speed
  • Low memory footprint

Advantages

  1. Resource Efficiency:
    • Small model size (2B parameters)
    • Lower memory requirements
    • Faster inference time
    • Suitable for deployment on less powerful hardware (see the quantized-loading sketch after this list)
  2. Task Performance:
    • Strong sentiment analysis capabilities
    • Decent MMLU performance
    • Good instruction following
    • Natural Swahili generation
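
For the most constrained deployments, the checkpoint can also be loaded with 4-bit quantization through bitsandbytes. This is a generic transformers loading pattern rather than anything specific to this model, so treat it as a sketch:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization to reduce memory use further
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    "alfaxadeyembe/gemma2-2b-swahili-it",
    quantization_config=bnb_config,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("alfaxadeyembe/gemma2-2b-swahili-it")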

Limitations

  • Simpler responses compared to 9B/27B variants

Citation

@misc{gemma2-2b-swahili-it,
  author = {Alfaxad Eyembe},
  title = {Gemma2-2B-Swahili-IT: A Lightweight Swahili Variant of Gemma2-2B-IT},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
}

Contact

For questions or feedback, please reach out through:
