christinacdl/XLM_RoBERTa-Multilingual-Clickbait-Detection

XLM_RoBERTa-Multilingual-Clickbait-Detection

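This model is a fine-tuned version of xlm-roberta-large for clickbait detection (text classification). The card metadata does not name the dataset; it is described under "Training and evaluation data". The model achieves the following results on the evaluation set: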

Loss: 0.2192
Micro F1: 0.9759
Macro F1: 0.9758
Accuracy: 0.9759
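
For single-label classification, micro-averaged F1 equals accuracy, which is consistent with the identical Micro F1 and Accuracy values above. The following is a minimal sketch of how the three metrics relate. The labels are made up for illustration and are not the model's predictions, and the 1 = clickbait encoding is an assumption.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1, accuracy) for single-label classification."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gets a false positive
            fn[t] += 1  # true class gets a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return micro, macro, accuracy


# Illustrative labels only (assumed encoding: 1 = clickbait, 0 = not clickbait).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
micro, macro, acc = f1_scores(y_true, y_pred)
print(f"micro={micro:.4f} macro={macro:.4f} accuracy={acc:.4f}")
```

Macro F1 can differ from the other two when classes are imbalanced or when errors fall mostly on one class.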

Test set Macro-F1 scores (%)

Multilingual test set: 97.28
en test set: 97.83
el test set: 97.32
it test set: 97.54
es test set: 97.67
ro test set: 97.40
de test set: 97.40
fr test set: 96.90
pl test set: 96.18

Intended uses & limitations

This model will be employed for an EU project.
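
A minimal sketch of how the model could be loaded for inference with the Transformers `pipeline` API. The helper names `classify` and `summarize` are hypothetical. The card does not document the label names, so the labels returned by the pipeline should be checked against the model's configuration before use.

```python
MODEL_ID = "christinacdl/XLM_RoBERTa-Multilingual-Clickbait-Detection"


def classify(texts, model_id=MODEL_ID):
    """Return one {"label": ..., "score": ...} dict per input text."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    clf = pipeline("text-classification", model=model_id)
    return clf(list(texts))


def summarize(texts, predictions):
    """Pair each input text with its predicted label and rounded score."""
    return [(t, p["label"], round(p["score"], 3)) for t, p in zip(texts, predictions)]


# Example call (downloads the model weights on first use):
#   headlines = ["You won't believe what happened next", "Parliament passes budget bill"]
#   for row in summarize(headlines, classify(headlines)):
#       print(row)
```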

Training and evaluation data

The "clickbait_detection_dataset" was machine-translated from English into Greek, Italian, Spanish, Romanian, French, and German using Opus-MT models.
It was also translated from English into Polish using an M2M NMT model.
Both sets of NMT models were run through the "EasyNMT" library.
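
The translation step described above can be sketched with EasyNMT as follows. This is an illustration, not the author's actual script: the helper names are hypothetical, and the specific M2M checkpoint (`m2m_100_418M`) is an assumption, since the card does not say which one was used.

```python
# Target languages listed in the card that were translated with Opus-MT.
OPUS_MT_TARGETS = ("el", "it", "es", "ro", "fr", "de")
# Assumption: the card only says "M2M"; EasyNMT also offers a larger M2M checkpoint.
M2M_MODEL = "m2m_100_418M"


def model_for(target_lang):
    """Pick the EasyNMT model name for a target language, following the card."""
    if target_lang in OPUS_MT_TARGETS:
        return "opus-mt"
    if target_lang == "pl":
        return M2M_MODEL
    raise ValueError(f"no translation model configured for {target_lang!r}")


def translate_texts(texts, target_lang):
    """Translate English texts into target_lang with the selected NMT model."""
    # Imported lazily so this module loads even without easynmt installed.
    from easynmt import EasyNMT

    model = EasyNMT(model_for(target_lang))
    return model.translate(list(texts), source_lang="en", target_lang=target_lang)


# Example call (downloads the translation model on first use):
#   translate_texts(["This one trick will change your life"], "pl")
```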

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 4
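
As a rough sketch, the values above map onto Transformers `TrainingArguments` as shown below. It assumes a single training device, so that the reported batch size of 16 is the per-device size; this is consistent with 16 x 2 accumulation steps = 32. The `output_dir` value and helper names are hypothetical.

```python
# Hyperparameters from the card, keyed by TrainingArguments parameter names.
HPARAMS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 16,  # assumes one device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,
}


def total_train_batch_size(hp, num_devices=1):
    """Effective batch size per optimizer step."""
    return (
        hp["per_device_train_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_args(output_dir="clickbait-xlmr"):
    """Build TrainingArguments from HPARAMS (output_dir is a placeholder)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HPARAMS)


print(total_train_batch_size(HPARAMS))  # 32, matching total_train_batch_size
```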

Framework versions

Transformers 4.36.1
PyTorch 2.1.0+cu121
Datasets 2.13.1
Tokenizers 0.15.0

Model size: 560M params (Safetensors, tensor type F32)

Base model: FacebookAI/xlm-roberta-large