---
datasets:
- marcuskd/reviews_binary_not4_concat
language:
- 'no'
- nb
- nn
metrics:
- accuracy
- recall
- precision
- f1
---
# Model Card for Model ID
Sentiment analysis for Norwegian reviews.
# Model Description
This model was trained on a dataset built by concatenating the Norwegian Review Corpus (https://github.com/ltgoslo/norec) with a Norwegian sentiment dataset from Hugging Face (https://huggingface.co./datasets/sepidmnorozy/Norwegian_sentiment).
It is intended for testing purposes only.
- **Developed by:** Simen Aabol and Marcus Dragsten
- **Finetuned from model:** norbert2
# Direct Use
Pass in Norwegian sentences to check their sentiment (negative or positive).
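A minimal sketch of how this might look with the `transformers` text-classification pipeline. The helper name `classify` and the repository ID in the usage comment are placeholders, since this card does not state the model's Hub ID, and the label names returned depend on the model's configuration.

```python
def classify(texts, model_id):
    """Return one {'label': ..., 'score': ...} dict per input text."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    return classifier(list(texts))


# Hypothetical usage; replace the placeholder with the actual Hub repository ID.
# classify(["Denne filmen var fantastisk!"], "<user>/<model-name>")
```

Calling the helper downloads the model weights on first use, so it needs network access or a local cache.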
# Training Details
## Training and Testing Data
https://huggingface.co./datasets/marcuskd/reviews_binary_not4_concat
### Preprocessing
Tokenized using:
```python
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("ltgoslo/norbert2")
```
Training arguments for this model:
```python
from transformers import TrainingArguments
training_args = TrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=10,             # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=10,                # log every 10 steps
)
```
# Evaluation
Evaluated on the test split of the dataset.
```python
{
'accuracy': 0.8357214261912695,
'recall': 0.886873508353222,
'precision': 0.8789025543992431,
'f1': 0.8828700403896412,
'total_time_in_seconds': 94.33071640000003,
'samples_per_second': 31.81360340013276,
'latency_in_seconds': 0.03143309443518828
}
```
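As a quick sanity check on the reported numbers, F1 should equal the harmonic mean of precision and recall, and latency should be the reciprocal of throughput. The sketch below recomputes both from the values above. The test-set size it derives is an inference from the timing figures, not a number stated in this card.

```python
# Values copied from the evaluation results reported in this card.
reported = {
    "recall": 0.886873508353222,
    "precision": 0.8789025543992431,
    "f1": 0.8828700403896412,
    "total_time_in_seconds": 94.33071640000003,
    "samples_per_second": 31.81360340013276,
    "latency_in_seconds": 0.03143309443518828,
}

p, r = reported["precision"], reported["recall"]

# F1 is the harmonic mean of precision and recall.
f1 = 2 * p * r / (p + r)

# Per-sample latency is the reciprocal of throughput.
latency = 1 / reported["samples_per_second"]

# Implied number of evaluated samples (derived, not stated in the card).
n_samples = round(reported["total_time_in_seconds"] * reported["samples_per_second"])

print(f"recomputed f1:      {f1:.6f}")
print(f"recomputed latency: {latency:.6f} s")
print(f"implied samples:    {n_samples}")
```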