metadata
license: mit
base_model: facebook/bart-large-xsum
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: bart_samsum
    results: []

bart_samsum

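This model is a fine-tuned version of facebook/bart-large-xsum. The auto-generated card does not record the training dataset (it is listed as "None"); the model name suggests SAMSum, but the card does not confirm this. It achieves the following results on the evaluation set: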

  • Loss: 1.4704
  • Rouge1: 54.8232
  • Rouge2: 30.1114
  • Rougel: 45.2666
  • Rougelsum: 50.7533
  • Gen Len: 30.3399

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 4
  • mixed_precision_training: Native AMP
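As a rough illustration of how these values fit together, the sketch below stores them in a plain dictionary and derives the effective batch size and a linear learning-rate decay. The key names mirror keyword arguments of `transformers.TrainingArguments`, but that mapping is an assumption, as are the single-device setup, the absence of warmup, and the use of 7364 (the last step in the training results) as the total step count. The helpers `effective_batch_size` and `linear_lr` are hypothetical names introduced for this example.

```python
# Hyperparameters copied from the list above. The key names follow keyword
# arguments of transformers.TrainingArguments; this mapping is an assumption.
hparams = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,
    "fp16": True,  # assumed equivalent of "Native AMP" in the card
}

NUM_DEVICES = 1      # assumption: one device, consistent with 4 * 2 = 8
TOTAL_STEPS = 7364   # assumption: last step listed in the training results


def effective_batch_size(h, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer update."""
    return (
        h["per_device_train_batch_size"]
        * h["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup (none by default) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:", effective_batch_size(hparams))
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS, hparams["learning_rate"]))
```

Under these assumptions the learning rate starts at 2e-05, is halved at the midpoint, and reaches zero at the final step.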

Training results

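| Training Loss | Epoch  | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|--------|------|-----------------|---------|---------|---------|-----------|---------|
| 1.3807        | 0.9997 | 1841 | 1.5203          | 52.4158 | 27.5034 | 42.8274 | 48.0361   | 31.4664 |
| 1.077         | 2.0    | 3683 | 1.5038          | 53.5277 | 28.5946 | 44.2315 | 49.5696   | 30.768  |
| 0.831         | 2.9997 | 5524 | 1.5362          | 52.9008 | 27.7041 | 43.5637 | 48.3921   | 29.9243 |
| 0.6919        | 3.9989 | 7364 | 1.6272          | 52.8716 | 27.9183 | 43.8019 | 48.6547   | 30.2002 |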
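The per-epoch results above can be read programmatically to pick a checkpoint. The sketch below copies the rows into tuples, selects the best epoch by validation loss and by Rouge1, and checks whether training loss keeps falling while validation loss rises afterwards. The training-set size estimate is only an approximation: it assumes the total batch size of 8 listed in the hyperparameters and relies on the rounded epoch value in the first row.

```python
# Rows copied from the training results:
# (training loss, epoch, step, validation loss,
#  Rouge1, Rouge2, Rougel, Rougelsum, gen len)
ROWS = [
    (1.3807, 0.9997, 1841, 1.5203, 52.4158, 27.5034, 42.8274, 48.0361, 31.4664),
    (1.0770, 2.0000, 3683, 1.5038, 53.5277, 28.5946, 44.2315, 49.5696, 30.7680),
    (0.8310, 2.9997, 5524, 1.5362, 52.9008, 27.7041, 43.5637, 48.3921, 29.9243),
    (0.6919, 3.9989, 7364, 1.6272, 52.8716, 27.9183, 43.8019, 48.6547, 30.2002),
]
TRAIN_LOSS, EPOCH, STEP, VAL_LOSS, ROUGE1 = 0, 1, 2, 3, 4
TOTAL_TRAIN_BATCH_SIZE = 8  # from the hyperparameter list

# Best checkpoint under two common selection criteria.
best_by_loss = min(ROWS, key=lambda r: r[VAL_LOSS])
best_by_rouge1 = max(ROWS, key=lambda r: r[ROUGE1])

# Training loss falls every epoch, while validation loss ends above its minimum.
train_loss_decreasing = all(
    a[TRAIN_LOSS] > b[TRAIN_LOSS] for a, b in zip(ROWS, ROWS[1:])
)
val_loss_rises_after_best = ROWS[-1][VAL_LOSS] > best_by_loss[VAL_LOSS]

# Rough training-set size: optimizer steps per epoch times examples per step.
first = ROWS[0]
approx_train_examples = round(first[STEP] / first[EPOCH] * TOTAL_TRAIN_BATCH_SIZE)

print("best step by validation loss:", best_by_loss[STEP])
print("best step by Rouge1:", best_by_rouge1[STEP])
print("train loss decreasing:", train_loss_decreasing)
print("validation loss rises after best:", val_loss_rises_after_best)
print("approximate training examples:", approx_train_examples)
```

Both criteria point to the second row (step 3683). Training loss continues to drop after that point while validation loss climbs, which is the usual sign of overfitting in later epochs.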

Framework versions

  • Transformers 4.42.4
  • PyTorch 2.3.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1