---
license: mit
base_model: unicamp-dl/ptt5-base-portuguese-vocab
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: ptt5-wikilingua-1024
    results: []
---

# ptt5-wikilingua-1024

This model is a fine-tuned version of [unicamp-dl/ptt5-base-portuguese-vocab](https://huggingface.co/unicamp-dl/ptt5-base-portuguese-vocab) on an unknown dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the list):

- Loss: 1.8346
- Rouge1: 26.0293
- Rouge2: 11.2397
- Rougel: 22.2357
- Rougelsum: 25.393
- Gen Len: 18.4771
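
The card does not state how the model is meant to be called. Below is a minimal inference sketch under the assumptions that the repository id is `arthurmluz/ptt5-wikilingua-1024`, that the model performs Portuguese abstractive summarization, and that no task prefix is required; none of these details are confirmed by the card.

```python
# Minimal summarization sketch. The repo id, the 1024-token input limit, and the
# generation settings are assumptions, not values taken from this card.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "arthurmluz/ptt5-wikilingua-1024"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Paste here the Portuguese text you want to summarize."
inputs = tokenizer(text, max_length=1024, truncation=True, return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```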

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
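
As a rough reference, the sketch below maps the listed values onto `Seq2SeqTrainingArguments`; the output path, evaluation strategy, and `predict_with_generate` flag are assumptions, not taken from the card.

```python
# Hedged reconstruction of the training configuration. Only the numeric values
# mirror the list above; paths and strategies are placeholder assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ptt5-wikilingua-1024",     # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    evaluation_strategy="epoch",           # assumed: the results table reports one eval per epoch
    predict_with_generate=True,            # assumed: needed to compute ROUGE / Gen Len
)
```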

### Training results

| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.0816        | 1.0   | 28580  | 1.9680          | 23.8868 | 9.5463  | 20.5459 | 23.3329   | 18.168  |
| 1.9469        | 2.0   | 57160  | 1.9000          | 24.7191 | 10.1206 | 21.165  | 24.1349   | 18.3899 |
| 1.9482        | 3.0   | 85740  | 1.8655          | 24.9016 | 10.3913 | 21.325  | 24.3342   | 18.3192 |
| 1.808         | 4.0   | 114320 | 1.8422          | 25.3346 | 10.7062 | 21.6946 | 24.764    | 18.3628 |
| 1.7811        | 5.0   | 142900 | 1.8304          | 25.3047 | 10.7773 | 21.7447 | 24.7572   | 18.2808 |
| 1.7676        | 6.0   | 171480 | 1.8161          | 25.5816 | 10.9429 | 21.9072 | 24.9958   | 18.3839 |
| 1.6651        | 7.0   | 200060 | 1.8081          | 25.5281 | 10.9006 | 21.8813 | 24.9432   | 18.3506 |
| 1.6461        | 8.0   | 228640 | 1.8047          | 25.6912 | 10.9803 | 21.9881 | 25.1059   | 18.39   |
| 1.6942        | 9.0   | 257220 | 1.8004          | 25.7941 | 11.0952 | 22.1048 | 25.2158   | 18.3609 |
| 1.6389        | 10.0  | 285800 | 1.7971          | 25.8327 | 11.1257 | 22.1268 | 25.2338   | 18.3792 |
| 1.6152        | 11.0  | 314380 | 1.7964          | 25.7519 | 11.1059 | 22.1061 | 25.178    | 18.4059 |
| 1.6127        | 12.0  | 342960 | 1.7974          | 25.9198 | 11.218  | 22.2459 | 25.3411   | 18.3953 |
| 1.5946        | 13.0  | 371540 | 1.8020          | 26.0687 | 11.3053 | 22.3127 | 25.4836   | 18.4025 |
| 1.5988        | 14.0  | 400120 | 1.8034          | 25.9518 | 11.1943 | 22.233  | 25.3327   | 18.4376 |
| 1.5474        | 15.0  | 428700 | 1.8008          | 26.0176 | 11.2425 | 22.2723 | 25.4065   | 18.4397 |
| 1.5135        | 16.0  | 457280 | 1.7997          | 26.0409 | 11.2593 | 22.2739 | 25.4333   | 18.441  |
| 1.563         | 17.0  | 485860 | 1.8130          | 26.0385 | 11.2479 | 22.2757 | 25.4155   | 18.4556 |
| 1.4997        | 18.0  | 514440 | 1.8098          | 25.9907 | 11.2433 | 22.2378 | 25.3589   | 18.4048 |
| 1.4414        | 19.0  | 543020 | 1.8161          | 26.0156 | 11.209  | 22.2514 | 25.3623   | 18.4738 |
| 1.4487        | 20.0  | 571600 | 1.8128          | 26.0583 | 11.2856 | 22.2673 | 25.4279   | 18.4353 |
| 1.4434        | 21.0  | 600180 | 1.8189          | 25.9673 | 11.2448 | 22.1904 | 25.3287   | 18.448  |
| 1.4699        | 22.0  | 628760 | 1.8188          | 26.0581 | 11.288  | 22.2603 | 25.4347   | 18.4698 |
| 1.4282        | 23.0  | 657340 | 1.8235          | 25.9654 | 11.1782 | 22.2008 | 25.3327   | 18.4548 |
| 1.4411        | 24.0  | 685920 | 1.8265          | 26.1178 | 11.3101 | 22.3081 | 25.474    | 18.4547 |
| 1.3912        | 25.0  | 714500 | 1.8309          | 26.0667 | 11.2725 | 22.2863 | 25.4394   | 18.4705 |
| 1.4061        | 26.0  | 743080 | 1.8309          | 26.0472 | 11.2591 | 22.2589 | 25.4179   | 18.4803 |
| 1.4594        | 27.0  | 771660 | 1.8289          | 26.0164 | 11.2367 | 22.2239 | 25.3929   | 18.4811 |
| 1.3836        | 28.0  | 800240 | 1.8323          | 26.0416 | 11.2521 | 22.2303 | 25.4106   | 18.4734 |
| 1.4051        | 29.0  | 828820 | 1.8349          | 26.0081 | 11.2332 | 22.213  | 25.3822   | 18.4797 |
| 1.3833        | 30.0  | 857400 | 1.8346          | 26.0293 | 11.2397 | 22.2357 | 25.393    | 18.4771 |
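
The ROUGE and Gen Len columns above match what the standard `evaluate`-based metric function for summarization produces; the sketch below shows that common setup and is an assumption about this run, not code from this repository.

```python
# Hedged sketch of a compute_metrics function producing Rouge1/Rouge2/Rougel/
# Rougelsum (scaled to 0-100) and Gen Len figures like those in the table.
import numpy as np
import evaluate
from transformers import AutoTokenizer

rouge = evaluate.load("rouge")
tokenizer = AutoTokenizer.from_pretrained("unicamp-dl/ptt5-base-portuguese-vocab")

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)  # undo label masking
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    result = {k: round(v * 100, 4) for k, v in result.items()}
    result["gen_len"] = np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    )
    return result
```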

### Framework versions

- Transformers 4.34.0
- Pytorch 2.0.1+cu117
- Datasets 2.14.5
- Tokenizers 0.14.1