flan-t5-base-dialogsum_v2
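This model was trained from scratch on an unknown dataset. On the evaluation set it achieves the following results, which correspond to the epoch 15 checkpoint (step 23370), the row with the lowest validation loss in the training results table: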
- Loss: 0.9323
- Rouge1: 49.6194
- Rouge2: 23.9441
- Rougel: 47.1784
- Rougelsum: 47.7351
- Gen Len: 19.0
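To give a feel for what the ROUGE numbers above measure, here is a minimal sketch of a ROUGE-1 F1 score computed from unigram overlap. It is a simplification and not the evaluation code used for this model: the function name `rouge1_f1` is hypothetical, tokenization is plain lowercased whitespace splitting with no stemming, and scaling to the 0-100 range is an assumption based on the magnitude of the reported values.

```python
from collections import Counter


def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1 F1 on lowercased whitespace tokens, scaled to 0-100."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped unigram overlap: each token counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative sentences only, not taken from the evaluation set.
    print(rouge1_f1("the cat sat on the mat", "the cat is on the mat"))
```

ROUGE-2 applies the same idea to bigrams, and ROUGE-L replaces the overlap count with the length of the longest common subsequence.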
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 25
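The hyperparameters listed above could map onto `Seq2SeqTrainingArguments` keyword arguments roughly as sketched below. This is not the original training script. Several points are assumptions: mapping `train_batch_size` to a per-device value presumes a single device, the card lists no warmup steps (taken as zero here), the total of 38950 steps is read from the last row of the training results table, and `linear_lr` is a hypothetical helper that only mirrors what a linear decay schedule computes.

```python
# Keyword arguments named after transformers.Seq2SeqTrainingArguments fields.
# Values come from the hyperparameter list; the per-device mapping assumes
# training ran on a single device.
TRAINING_KWARGS = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 25,
}

TOTAL_STEPS = 38950  # final step in the training results table


def linear_lr(step, base_lr=5e-05, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Hypothetical helper: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Learning rate at the start, midpoint, and end of training.
    for step in (0, 19475, 38950):
        print(step, linear_lr(step))
```

Passing `**TRAINING_KWARGS` together with an `output_dir` to `Seq2SeqTrainingArguments` would be one way to approximate this configuration.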
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
0.8419 | 1.0 | 1558 | 0.9686 | 49.1717 | 21.6321 | 46.5027 | 46.9099 | 19.0 |
0.7931 | 2.0 | 3116 | 0.9393 | 49.0429 | 25.016 | 46.8583 | 47.1894 | 19.0 |
0.7931 | 3.0 | 4674 | 0.9555 | 49.969 | 24.7039 | 47.6287 | 48.1426 | 19.0 |
0.7572 | 4.0 | 6232 | 0.9746 | 49.1309 | 23.2537 | 46.4771 | 46.9249 | 19.0 |
0.7417 | 5.0 | 7790 | 0.9820 | 49.2093 | 23.1137 | 46.6788 | 47.1939 | 19.0 |
0.7491 | 6.0 | 9348 | 0.9592 | 50.4655 | 25.5968 | 48.0925 | 48.6231 | 19.0 |
0.7181 | 7.0 | 10906 | 0.9593 | 50.6565 | 24.5502 | 48.0572 | 48.5985 | 19.0 |
0.7144 | 8.0 | 12464 | 0.9386 | 50.9351 | 25.9342 | 48.4183 | 48.898 | 19.0 |
0.7137 | 9.0 | 14022 | 0.9486 | 49.5404 | 23.7464 | 46.9305 | 47.4827 | 19.0 |
0.6728 | 10.0 | 15580 | 0.9419 | 49.9498 | 23.8617 | 47.3421 | 47.9081 | 19.0 |
0.6783 | 11.0 | 17138 | 0.9459 | 50.52 | 25.9799 | 48.0356 | 48.596 | 19.0 |
0.6708 | 12.0 | 18696 | 0.9552 | 49.85 | 23.8992 | 47.3564 | 47.8535 | 19.0 |
0.6515 | 13.0 | 20254 | 0.9462 | 49.7239 | 25.468 | 47.4924 | 48.0146 | 19.0 |
0.6419 | 14.0 | 21812 | 0.9507 | 49.8791 | 25.3529 | 47.5795 | 48.0845 | 19.0 |
0.6297 | 15.0 | 23370 | 0.9323 | 49.6194 | 23.9441 | 47.1784 | 47.7351 | 19.0 |
0.6354 | 16.0 | 24928 | 0.9408 | 50.3383 | 25.6301 | 48.1241 | 48.6641 | 19.0 |
0.6178 | 17.0 | 26486 | 0.9420 | 49.5266 | 24.6134 | 47.1452 | 47.6481 | 19.0 |
0.608 | 18.0 | 28044 | 0.9549 | 49.7907 | 24.9381 | 47.4622 | 48.0312 | 19.0 |
0.6205 | 19.0 | 29602 | 0.9580 | 49.687 | 25.1737 | 47.4014 | 47.8466 | 19.0 |
0.5946 | 20.0 | 31160 | 0.9519 | 50.1159 | 25.3671 | 47.8232 | 48.3411 | 19.0 |
0.6107 | 21.0 | 32718 | 0.9549 | 49.9508 | 24.746 | 47.5774 | 48.1047 | 19.0 |
0.591 | 22.0 | 34276 | 0.9543 | 50.1561 | 25.3451 | 47.8341 | 48.3492 | 19.0 |
0.6017 | 23.0 | 35834 | 0.9570 | 49.893 | 25.1383 | 47.503 | 48.059 | 19.0 |
0.5942 | 24.0 | 37392 | 0.9584 | 49.9479 | 24.9422 | 47.5917 | 48.1012 | 19.0 |
0.5826 | 25.0 | 38950 | 0.9592 | 49.9049 | 25.1102 | 47.5736 | 48.1165 | 19.0 |
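The table can be read programmatically to see which epoch scores best on each metric. The sketch below copies the validation loss and Rouge1 columns from the table. The helper names `best_epoch` and `implied_example_range` are hypothetical, and the dataset-size estimate assumes a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped.

```python
# Validation loss and Rouge1 per epoch (epochs 1 to 25), copied from the table.
VAL_LOSS = [
    0.9686, 0.9393, 0.9555, 0.9746, 0.9820, 0.9592, 0.9593, 0.9386, 0.9486,
    0.9419, 0.9459, 0.9552, 0.9462, 0.9507, 0.9323, 0.9408, 0.9420, 0.9549,
    0.9580, 0.9519, 0.9549, 0.9543, 0.9570, 0.9584, 0.9592,
]
ROUGE1 = [
    49.1717, 49.0429, 49.969, 49.1309, 49.2093, 50.4655, 50.6565, 50.9351,
    49.5404, 49.9498, 50.52, 49.85, 49.7239, 49.8791, 49.6194, 50.3383,
    49.5266, 49.7907, 49.687, 50.1159, 49.9508, 50.1561, 49.893, 49.9479,
    49.9049,
]


def best_epoch(values, higher_is_better):
    """Return the 1-based epoch whose value is best (first one on ties)."""
    pick = max if higher_is_better else min
    return values.index(pick(values)) + 1


def implied_example_range(steps_per_epoch, batch_size):
    """Smallest and largest training-set sizes consistent with the step count."""
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


if __name__ == "__main__":
    print(best_epoch(VAL_LOSS, higher_is_better=False))  # 15
    print(best_epoch(ROUGE1, higher_is_better=True))     # 8
    print(implied_example_range(1558, 8))                # (12457, 12464)
```

Validation loss bottoms out at epoch 15 while Rouge1 peaks at epoch 8, so the choice of "best" checkpoint depends on which metric is used for selection.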
Framework versions
- Transformers 4.35.2
- Pytorch 2.0.1+cu117
- Datasets 2.15.0
- Tokenizers 0.15.0