XLM_RoBERTa-Multilingual-Clickbait-Detection
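This model is a fine-tuned version of xlm-roberta-large, trained on a machine-translated multilingual version of the "clickbait_detection_dataset" (see Training and evaluation data). It achieves the following results on the evaluation set: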
- Loss: 0.2192
- Micro F1: 0.9759
- Macro F1: 0.9758
- Accuracy: 0.9759
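Micro F1 and Accuracy are identical above, which is expected for single-label classification: pooling true positives, false positives and false negatives over all classes reduces Micro F1 to the fraction of correct predictions. Macro F1 instead averages the per-class F1 scores. The following is a minimal sketch of both definitions, not the evaluation code used for this model; the function name `f1_scores`, the toy labels and the 1 = clickbait / 0 = not clickbait mapping are assumptions made for illustration.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1, accuracy) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted, but wrongly
            fn[t] += 1  # the true class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, each class weighted equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return micro, macro, accuracy


# Hypothetical toy labels: 1 = clickbait, 0 = not clickbait.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
micro, macro, acc = f1_scores(y_true, y_pred)
```

On this toy input Micro F1 equals Accuracy, while Macro F1 is lower because the smaller class has the weaker per-class score.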
Test set Macro-F1 scores (%)
- Multilingual test set: 97.28
- en test set: 97.83
- el test set: 97.32
- it test set: 97.54
- es test set: 97.67
- ro test set: 97.40
- de test set: 97.40
- fr test set: 96.90
- pl test set: 96.18
Intended uses & limitations
- This model will be employed for an EU project.
Training and evaluation data
- The "clickbait_detection_dataset" was translated from English into Greek, Italian, Spanish, Romanian, French and German using Opus-MT models.
- The dataset was also translated from English into Polish using the M2M NMT model.
- All NMT models were run through the "EasyNMT" library.
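The translation step described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the script used to build the dataset: the helper names (`translate_corpus`, `build_multilingual_copy`, `load_translators`) are hypothetical, and the card does not say which M2M checkpoint was used, so `m2m_100_418M` is only a placeholder default. `EasyNMT(...)` and its `translate(documents, target_lang=..., source_lang=...)` method are the library's public interface.

```python
# Language codes as they appear in the test-set list of this card.
OPUS_MT_TARGETS = ["el", "it", "es", "ro", "fr", "de"]
M2M_TARGETS = ["pl"]


def translate_corpus(texts, translator, target_langs, source_lang="en"):
    """Translate every text into every target language.

    `translator` is any object exposing an EasyNMT-style
    translate(documents, target_lang=..., source_lang=...) method.
    """
    return {
        lang: translator.translate(texts, target_lang=lang, source_lang=source_lang)
        for lang in target_langs
    }


def build_multilingual_copy(texts, opus_translator, m2m_translator):
    """Return {language code: list of texts}, including the English source."""
    corpus = {"en": list(texts)}
    corpus.update(translate_corpus(texts, opus_translator, OPUS_MT_TARGETS))
    corpus.update(translate_corpus(texts, m2m_translator, M2M_TARGETS))
    return corpus


def load_translators(m2m_name="m2m_100_418M"):
    """Load the two EasyNMT models (needs `pip install easynmt`).

    Assumption: the M2M checkpoint name is a guess; the card only says "M2M".
    """
    from easynmt import EasyNMT  # imported lazily so the helpers above stay dependency-free

    return EasyNMT("opus-mt"), EasyNMT(m2m_name)
```

With EasyNMT installed, `build_multilingual_copy(headlines, *load_translators())` would return one list of headlines per language; the model weights are downloaded on first use.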
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
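As a sketch of how these values relate to each other, the listed settings can be written as keyword arguments in the naming used by `transformers.TrainingArguments`. The two helper functions are hypothetical and only illustrate the arithmetic: the total train batch size of 32 is the per-device batch size times the gradient accumulation steps (which implies a single device), and a linear scheduler decays the learning rate to zero over training. Warmup is not listed in the card, so zero warmup steps is an assumption here.

```python
# Values copied from the hyperparameter list; key names follow
# transformers.TrainingArguments.
training_kwargs = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,
}


def effective_batch_size(kwargs, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

Passing the dictionary as `TrainingArguments(output_dir=..., **training_kwargs)` would reproduce the listed settings, with the output directory left to the user.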
Framework versions
- Transformers 4.36.1
- Pytorch 2.1.0+cu121
- Datasets 2.13.1
- Tokenizers 0.15.0