---
license: mit
base_model: facebook/bart-large-xsum
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: text_shortening_model_v40
    results: []
---

text_shortening_model_v40

This model is a fine-tuned version of facebook/bart-large-xsum on an unspecified dataset. It achieves the following results on the evaluation set (a hedged usage sketch follows the list):

  • Loss: 3.3335
  • Rouge1: 0.4511
  • Rouge2: 0.2377
  • Rougel: 0.4039
  • Rougelsum: 0.4038
  • Bert precision: 0.8635
  • Bert recall: 0.8629
  • Average word count: 8.5826
  • Max word count: 16
  • Min word count: 5
  • Average token count: 16.5616
  • % shortened texts with length > 12: 4.8048
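
The snippet below is a minimal usage sketch, not part of the original card: it assumes the checkpoint is published under the hypothetical repo id ldos/text_shortening_model_v40 (substitute the actual repo id or a local checkpoint path), and the generation settings are illustrative only.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo id inferred from the card name; replace with the real
# repo id or a local path to the fine-tuned checkpoint.
model_id = "ldos/text_shortening_model_v40"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "An example sentence that should be shortened into a more compact form."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Illustrative settings: max_length of ~16 roughly matches the average token
# counts reported in the evaluation results above.
output_ids = model.generate(**inputs, num_beams=4, max_length=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```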

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
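
For orientation, the listed values map onto Seq2SeqTrainingArguments roughly as sketched below. This is a reconstruction under assumptions, not the original training script; output_dir, the evaluation strategy, and predict_with_generate are placeholders chosen to match how the per-epoch results below appear to have been logged.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: fields not listed in the card (output_dir, evaluation_strategy,
# predict_with_generate) are assumptions; the rest mirrors the list above.
training_args = Seq2SeqTrainingArguments(
    output_dir="text_shortening_model_v40",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=20,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",   # results below are reported once per epoch
    predict_with_generate=True,    # required to compute ROUGE from generated text
)
```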

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Bert precision | Bert recall | Average word count | Max word count | Min word count | Average token count | % shortened texts with length > 12 |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|----------------|-------------|--------------------|----------------|----------------|---------------------|------------------------------------|
| 3.0922        | 1.0   | 73   | 2.2144          | 0.4539 | 0.2272 | 0.4068 | 0.4055    | 0.8657         | 0.8684      | 8.7027             | 15             | 5              | 14.3423             | 4.2042                             |
| 1.75          | 2.0   | 146  | 2.0055          | 0.4658 | 0.2381 | 0.4085 | 0.4088    | 0.8654         | 0.8656      | 8.7087             | 16             | 5              | 15.1652             | 4.8048                             |
| 1.311         | 3.0   | 219  | 2.0021          | 0.456  | 0.2257 | 0.4124 | 0.4117    | 0.8644         | 0.8646      | 8.6396             | 15             | 5              | 15.9279             | 5.1051                             |
| 1.0163        | 4.0   | 292  | 2.0698          | 0.467  | 0.2403 | 0.4159 | 0.4162    | 0.8636         | 0.8699      | 9.2973             | 16             | 5              | 17.2162             | 9.9099                             |
| 0.8546        | 5.0   | 365  | 2.0707          | 0.4527 | 0.2392 | 0.4129 | 0.4126    | 0.8637         | 0.8647      | 8.4895             | 17             | 4              | 16.3153             | 4.8048                             |
| 0.7222        | 6.0   | 438  | 2.1452          | 0.4562 | 0.2349 | 0.4077 | 0.4064    | 0.8693         | 0.8623      | 8.021              | 15             | 4              | 14.1051             | 1.2012                             |
| 0.5723        | 7.0   | 511  | 2.3520          | 0.4563 | 0.2403 | 0.4142 | 0.413     | 0.8666         | 0.8658      | 8.5916             | 16             | 5              | 16.5465             | 6.9069                             |
| 0.5274        | 8.0   | 584  | 2.2896          | 0.4502 | 0.2434 | 0.4077 | 0.4078    | 0.8639         | 0.8639      | 8.5586             | 14             | 5              | 14.8048             | 2.1021                             |
| 0.3767        | 9.0   | 657  | 2.2928          | 0.4565 | 0.2368 | 0.4125 | 0.4114    | 0.8682         | 0.8623      | 8.0691             | 14             | 4              | 14.4204             | 1.8018                             |
| 0.2987        | 10.0  | 730  | 2.5411          | 0.4539 | 0.2383 | 0.4057 | 0.4056    | 0.8652         | 0.8631      | 8.5826             | 15             | 5              | 15.6637             | 4.5045                             |
| 0.2319        | 11.0  | 803  | 2.8995          | 0.4513 | 0.2367 | 0.4069 | 0.4068    | 0.8631         | 0.8622      | 8.6607             | 17             | 5              | 16.4535             | 5.7057                             |
| 0.2167        | 12.0  | 876  | 2.7950          | 0.4632 | 0.2521 | 0.4163 | 0.4162    | 0.8673         | 0.8679      | 8.7267             | 16             | 4              | 16.3243             | 6.3063                             |
| 0.1952        | 13.0  | 949  | 2.6240          | 0.4537 | 0.2396 | 0.406  | 0.4059    | 0.8632         | 0.8648      | 8.8258             | 18             | 5              | 16.2613             | 7.8078                             |
| 0.1395        | 14.0  | 1022 | 2.8894          | 0.4588 | 0.2412 | 0.4141 | 0.4144    | 0.864          | 0.8658      | 8.6216             | 15             | 5              | 16.6426             | 3.6036                             |
| 0.1298        | 15.0  | 1095 | 2.7580          | 0.4562 | 0.2384 | 0.4085 | 0.4088    | 0.8661         | 0.8659      | 8.5586             | 15             | 5              | 16.3634             | 5.4054                             |
| 0.1044        | 16.0  | 1168 | 2.7724          | 0.466  | 0.2527 | 0.4175 | 0.4171    | 0.8677         | 0.8694      | 8.7387             | 15             | 4              | 16.4535             | 5.1051                             |
| 0.0944        | 17.0  | 1241 | 2.9161          | 0.4429 | 0.232  | 0.3986 | 0.3986    | 0.8619         | 0.8621      | 8.6306             | 16             | 5              | 16.5255             | 5.4054                             |
| 0.077         | 18.0  | 1314 | 3.1718          | 0.4549 | 0.2372 | 0.4054 | 0.4052    | 0.863          | 0.8639      | 8.6456             | 15             | 5              | 16.7447             | 4.8048                             |
| 0.0561        | 19.0  | 1387 | 3.2650          | 0.4581 | 0.2413 | 0.4092 | 0.4089    | 0.866          | 0.865       | 8.5195             | 16             | 5              | 16.4174             | 4.8048                             |
| 0.0542        | 20.0  | 1460 | 3.3335          | 0.4511 | 0.2377 | 0.4039 | 0.4038    | 0.8635         | 0.8629      | 8.5826             | 16             | 5              | 16.5616             | 4.8048                             |

Framework versions

  • Transformers 4.33.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.13.3