metadata
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - precision
  - recall
  - f1
model-index:
  - name: roberta-base-mr-6000ar
    results: []

roberta-base-mr-6000ar

This model was trained from scratch on the Internal Selection for BDC Satria Data 2024 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0515
  • Accuracy: 0.9413
  • Precision: 0.9643
  • Recall: 0.9265
  • F1: 0.9450

Model description

The training dataset was augmented by paraphrasing, which produced 6000 additional rows.

Intended uses & limitations

This model was not used for the final submission in the internal selection.

Training and evaluation data

The training dataset had 1500 original rows plus 6000 augmented rows. The evaluation dataset had 500 rows.
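As a rough illustration of how the original and paraphrased rows might be combined, the sketch below assumes four paraphrases per original row (1500 × 4 = 6000). That ratio is consistent with the counts above but is not stated in this card. The `paraphrase` function, the `text` and `label` field names, and the toy data are hypothetical placeholders, not the actual augmentation pipeline.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def paraphrase(text: str, variant: int) -> str:
    # Hypothetical stand-in: a real pipeline would call a paraphrasing model here.
    return f"{text} (paraphrase {variant})"


def augment(rows: List[Row], per_row: int,
            fn: Callable[[str, int], str] = paraphrase) -> List[Row]:
    """Return the original rows followed by `per_row` paraphrases of each row."""
    extra = [
        {"text": fn(str(row["text"]), k), "label": row["label"]}
        for row in rows
        for k in range(1, per_row + 1)
    ]
    return rows + extra


# Toy data standing in for the 1500 original training rows (labels are made up).
original = [{"text": f"example {i}", "label": i % 2} for i in range(1500)]
train = augment(original, per_row=4)  # assumption: 4 paraphrases per row

print(len(original), len(train) - len(original), len(train))
```

Each paraphrase keeps the label of its source row, which is the usual reason paraphrasing is treated as a label-preserving augmentation.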

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
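For readers who want to reproduce a similar setup, the sketch below writes the listed values using the keyword names of `transformers.TrainingArguments`, and shows how a linear schedule decays the learning rate. It assumes single-device training (so the batch sizes are per-device values) and zero warmup steps, neither of which is stated in this card. The total of 2463 steps is the final step count reported in the training results.

```python
# Keyword names follow transformers.TrainingArguments; values come from the list above.
# Assumption: single-device training, so batch sizes are per-device values.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
}


def linear_lr(step: int, total_steps: int, base_lr: float,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 2463  # final step count reported in the training results

# Learning rate at the start and at the end of each epoch (821 steps per epoch).
for step in (0, 821, 1642, 2463):
    lr = linear_lr(step, TOTAL_STEPS, training_kwargs["learning_rate"])
    print(step, f"{lr:.3e}")
```

With no warmup, the rate falls by one third of its initial value per epoch and reaches zero at the last step.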

Training results

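| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
|---------------|-------|------|-----------------|----------|-----------|--------|--------|
| 0.0185 | 1.0 | 821 | 0.0800 | 0.9173 | 0.8879 | 0.9706 | 0.9274 |
| 0.0121 | 2.0 | 1642 | 0.0789 | 0.9147 | 0.9778 | 0.8627 | 0.9167 |
| 0.0101 | 3.0 | 2463 | 0.0515 | 0.9413 | 0.9643 | 0.9265 | 0.9450 |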
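The F1 values in the training results above are the harmonic mean of precision and recall. The following sketch recomputes F1 for each epoch from the reported precision and recall and compares it with the reported F1. The tolerance is an assumption chosen to absorb the four-decimal rounding of the inputs.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# (epoch, precision, recall, reported F1) copied from the training results.
rows = [
    (1, 0.8879, 0.9706, 0.9274),
    (2, 0.9778, 0.8627, 0.9167),
    (3, 0.9643, 0.9265, 0.9450),
]

TOLERANCE = 5e-4  # assumption: allows for rounding of the reported values

for epoch, precision, recall, reported in rows:
    computed = f1_score(precision, recall)
    status = "ok" if abs(computed - reported) < TOLERANCE else "mismatch"
    print(f"epoch {epoch}: computed F1 = {computed:.4f}, reported = {reported:.4f} ({status})")
```

Epoch 2 shows why F1 is worth reporting alongside the other metrics: precision rises while recall drops sharply, and F1 falls even though the validation loss improves slightly.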

Framework versions

  • Transformers 4.40.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1