Create README.md
README.md
ADDED
@@ -0,0 +1,60 @@
---
language: ha # Hausa language code
tags:
- sentiment-analysis
- hausa
- social-media
- transformers
- bert
license: apache-2.0
---

# Hausa Sentiment Analysis

This model is a fine-tuned version of `bert-base-cased` for sentiment analysis of Hausa text. It is trained to classify social media posts (tweets) into sentiment categories.

## Model Description

**Hausa Sentiment Analysis** is a BERT-based model fine-tuned to classify the sentiment of Hausa-language social media text. It was trained on 35,000 examples collected from various social media platforms.

## Intended Uses & Limitations

- **Intended Use**: Sentiment analysis of social media text in the Hausa language.
- **Primary Use Cases**: Monitoring and analyzing public sentiment on social media platforms; academic research in natural language processing (NLP) for low-resource languages.
- **Limitations**: May not perform well on text outside the social media domain or on dialectal variations of Hausa.

## Training Data

- **Data Source**: Collected from social media platforms.
- **Number of Examples**: 35,000
- **Preprocessing**: Text normalization, tokenization.
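
The card does not say which normalization steps were applied, so the following is only a minimal sketch of what tweet normalization could look like. The function name `normalize_tweet`, the `<url>` and `<user>` placeholders, and the choice of steps are all hypothetical. Case is left untouched here because `bert-base-cased` is case-sensitive; tokenization itself would be handled by the model's tokenizer.

```python
import re

# Hypothetical patterns; the actual preprocessing rules are not documented.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def normalize_tweet(text: str) -> str:
    """Mask URLs and user mentions, then collapse runs of whitespace."""
    text = URL_RE.sub("<url>", text)
    text = MENTION_RE.sub("<user>", text)
    return WHITESPACE_RE.sub(" ", text).strip()


example = normalize_tweet("@aisha  Ina son wannan waƙa!   https://example.com/x")
print(example)
```

Masking URLs and mentions is a common choice for social media text because those tokens rarely carry sentiment, but whether it helps for this dataset is an open question.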

## Training Procedure

- **Training Script**: Used the Hugging Face `Trainer` API.
- **Hyperparameters**:
  - Epochs: 40
  - Batch Size (Train): 32
  - Batch Size (Eval): 64
  - Warmup Steps: 10
  - Weight Decay: 0.01
  - Logging Steps: 200
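
As a rough sketch, the hyperparameters listed above could be passed to `transformers.TrainingArguments` as shown below. Mapping the batch sizes to the per-device parameters is an assumption, `output_dir` is a placeholder path, and the helper name `build_training_args` is hypothetical. Settings the card does not mention (such as the learning rate) are left at the library defaults.

```python
# Hyperparameters from the list above, keyed by the matching
# `transformers.TrainingArguments` parameter names.
# Assumption: the listed batch sizes are per-device values.
HYPERPARAMS = {
    "num_train_epochs": 40,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 64,
    "warmup_steps": 10,
    "weight_decay": 0.01,
    "logging_steps": 200,
}


def build_training_args(output_dir: str = "./results"):
    """Build TrainingArguments; `output_dir` is an assumed placeholder path."""
    # Imported lazily so the settings above can be inspected without
    # having `transformers` installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

The resulting object would then be handed to `Trainer` through its `args` parameter, together with the model and datasets.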

## Evaluation

- **Evaluation Metrics**: Accuracy, Precision, Recall, F1-score.
- **Results**: The model performed well on the validation set across these metrics; exact scores are not reported in this card.
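
For reference, the four metrics named above can be computed from predicted and true label ids as in this minimal sketch. The card does not state how precision, recall, and F1 were averaged across classes, so macro averaging is an assumption here, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with two classes; real label ids depend on the model config.
metrics = classification_metrics([0, 0, 1, 1], [0, 1, 1, 1])
print(metrics)
```

With the `Trainer` API, the predicted ids would typically come from taking the argmax over the logits before calling a function like this.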

## How to Use

To use this model for sentiment analysis, load it with the `transformers` library:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Kumshe/Hausa-sentiment-analysis")
model = AutoModelForSequenceClassification.from_pretrained("Kumshe/Hausa-sentiment-analysis")

# Example usage: replace the placeholder string with Hausa text
inputs = tokenizer("This is an example tweet in Hausa language", return_tensors="pt")
with torch.no_grad():  # inference only, so gradients are not needed
    outputs = model(**inputs)

# Map the highest-scoring logit to its label name from the model config
predicted_id = outputs.logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_id])
```