---
language:
- dv
base_model:
- openai-community/gpt2
datasets:
- wikimedia/wikipedia
---

# GPT 2 DV base

This is a GPT-2 model fine-tuned on Dhivehi language texts. The model was trained on a curated dataset of Dhivehi Wikipedia articles and can be used for text generation in the Dhivehi language.

## Model Description

- **Model Type:** GPT-2
- **Language:** Dhivehi (ދިވެހި)
- **Training Data:** Dhivehi Wikipedia articles
- **Last Updated:** 2024-11-25

## Performance Metrics

Evaluation metrics on the test set:

- Average Perplexity: 3.80
- Perplexity Std: 2.23
- Best Perplexity: 2.72

## Usage Example

```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Load model and tokenizer
model = GPT2LMHeadModel.from_pretrained("alakxender/dhivehi-gpt2-base")
tokenizer = GPT2TokenizerFast.from_pretrained("alakxender/dhivehi-gpt2-base")

# Prepare your prompt
prompt = "ދިވެހިރާއްޖެއަކީ"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate text
outputs = model.generate(
    **inputs,
    max_length=200,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    num_return_sequences=1
)

# Decode the generated text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```

## Training Details

The model was trained using the following configuration:

- Base model: GPT-2
- Training type: Full fine-tuning
- Mixed precision: FP16
- Gradient checkpointing: Enabled

### Hyperparameters:

- Learning rate: 5e-5
- Batch size: 32
- Gradient accumulation steps: 2
- Epochs: 3
- Weight decay: 0.01
- Warmup steps: 1000

## Limitations

- Primary training data is from Wikipedia, which may not cover all Dhivehi language contexts
- May not perform well on specialized or technical content
- Could reflect biases present in the training data
- Not recommended for production use without thorough evaluation

## Intended Uses

This model is suitable for:

- Dhivehi text generation
- Research on Dhivehi NLP
- Educational purposes
- Experimental applications

Not intended for:

- Critical or production systems
- Decision-making applications
- Tasks requiring factual accuracy
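

## Note on the Perplexity Metrics

The card reports an average, a standard deviation, and a best (lowest) perplexity on the test set, but does not say how they were computed. The sketch below shows one common way such figures could be derived. It assumes that each test sample's perplexity is the exponential of its mean per-token negative log-likelihood (natural log) and that the three numbers summarize the per-sample values. The function names, the sample values, and the choice of population standard deviation are all hypothetical and are not taken from the original evaluation.

```python
import math
import statistics


def sequence_perplexity(token_nlls):
    """Perplexity of one sequence from per-token negative log-likelihoods (natural log)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


def summarize_perplexities(per_sample_nlls):
    """Aggregate per-sample perplexities into average, std, and best (lowest)."""
    ppls = [sequence_perplexity(nlls) for nlls in per_sample_nlls]
    return {
        "average": statistics.mean(ppls),
        # Assumption: population std; the card does not say which variant was used.
        "std": statistics.pstdev(ppls),
        "best": min(ppls),
    }


# Hypothetical per-token losses for three test samples (illustrative values only).
samples = [[1.0, 1.0], [1.2, 1.6, 1.4], [2.0]]
summary = summarize_perplexities(samples)
print(summary)
```

In practice the per-token losses would come from the model's cross-entropy on held-out Dhivehi text. Lower perplexity means the model assigns higher probability to the reference tokens.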