metadata
license: mit
base_model: facebook/bart-large-cnn
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: BART-CNN-Convosumm
    results: []

BART-CNN-Convosumm

This model is a fine-tuned version of facebook/bart-large-cnn. The training dataset was not recorded (the auto-generated card lists it as None). It achieves the following results on the evaluation set:

  • Loss: 3.8797
  • Rouge1: 38.6252
  • Rouge2: 12.2556
  • Rougel: 23.902
  • Rougelsum: 34.6324
  • Gen Len: 81.28

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 20
  • total_train_batch_size: 20
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: polynomial
  • lr_scheduler_warmup_steps: 1
  • num_epochs: 7
  • label_smoothing_factor: 0.1
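The relationship between some of these values can be checked with a little arithmetic. The sketch below is illustrative only: it reproduces the total train batch size from the per-device batch size and the gradient accumulation steps, and approximates a polynomial schedule with warmup. The step count of 70 is taken from the training results; `LR_END`, `POWER`, and both function names are assumptions introduced here, not values or APIs recorded in this card.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 20
WARMUP_STEPS = 1
TOTAL_STEPS = 70  # last optimizer step reported in the training results

# Assumed values (not stated in this card): final learning rate and decay power.
LR_END = 1e-07
POWER = 1.0


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def polynomial_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then polynomial decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    if step >= TOTAL_STEPS:
        return LR_END
    remaining = 1 - (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return (LEARNING_RATE - LR_END) * remaining ** POWER + LR_END


print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for step in (0, 1, 10, 35, 70):
    print(f"step {step:>2}: lr = {polynomial_lr(step):.3e}")
```

With a single warmup step, the schedule reaches the peak learning rate almost immediately and spends the rest of training decaying, assuming the decay is linear (`POWER = 1.0`).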

Training results

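| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 6.207         | 1.0   | 10   | 4.2651          | 32.3341 | 7.812   | 20.0411 | 29.4849   | 77.38   |
| 4.0248        | 1.99  | 20   | 3.9903          | 36.0787 | 11.0447 | 21.3596 | 33.2903   | 130.58  |
| 3.5933        | 2.99  | 30   | 3.9020          | 34.2931 | 11.2036 | 20.7935 | 30.8361   | 140.02  |
| 3.3086        | 3.98  | 40   | 3.8712          | 38.4842 | 11.9947 | 23.4913 | 34.4347   | 85.78   |
| 3.112         | 4.98  | 50   | 3.8700          | 38.652  | 11.8315 | 23.5208 | 34.5998   | 76.2    |
| 2.9933        | 5.97  | 60   | 3.8809          | 38.66   | 12.3337 | 23.4394 | 35.1976   | 83.26   |
| 2.834         | 6.97  | 70   | 3.8797          | 38.6252 | 12.2556 | 23.902  | 34.6324   | 81.28   |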
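The evaluation rows above can be compared programmatically, for example to see which step scored best on a given metric. The following is a minimal sketch with the values copied from the training results; `EvalRow`, `best_by_loss`, and `best_by_metric` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    rouge1: float
    rouge2: float
    rougel: float
    rougelsum: float
    gen_len: float


# Values copied from the training results above.
ROWS = [
    EvalRow(6.207, 1.0, 10, 4.2651, 32.3341, 7.812, 20.0411, 29.4849, 77.38),
    EvalRow(4.0248, 1.99, 20, 3.9903, 36.0787, 11.0447, 21.3596, 33.2903, 130.58),
    EvalRow(3.5933, 2.99, 30, 3.9020, 34.2931, 11.2036, 20.7935, 30.8361, 140.02),
    EvalRow(3.3086, 3.98, 40, 3.8712, 38.4842, 11.9947, 23.4913, 34.4347, 85.78),
    EvalRow(3.112, 4.98, 50, 3.8700, 38.652, 11.8315, 23.5208, 34.5998, 76.2),
    EvalRow(2.9933, 5.97, 60, 3.8809, 38.66, 12.3337, 23.4394, 35.1976, 83.26),
    EvalRow(2.834, 6.97, 70, 3.8797, 38.6252, 12.2556, 23.902, 34.6324, 81.28),
]


def best_by_loss(rows: list[EvalRow]) -> EvalRow:
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r.val_loss)


def best_by_metric(rows: list[EvalRow], metric: str) -> EvalRow:
    """Row with the highest value of a ROUGE field such as 'rouge1'."""
    return max(rows, key=lambda r: getattr(r, metric))


print("lowest validation loss at step", best_by_loss(ROWS).step)
for name in ("rouge1", "rouge2", "rougel", "rougelsum"):
    print(f"highest {name} at step", best_by_metric(ROWS, name).step)
```

The headline metrics reported at the top of this card match the final row (step 70). Validation loss is lowest at step 50, while training loss keeps falling through step 70, so the later epochs bring little change on the evaluation set.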

Framework versions

  • Transformers 4.35.2
  • PyTorch 2.0.0
  • Datasets 2.1.0
  • Tokenizers 0.15.0