metadata
license: apache-2.0
base_model: google/flan-t5-small
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: fine-tune-llama2
    results: []

fine-tune-flan-t5

This model is a fine-tuned version of google/flan-t5-small on an unspecified dataset (the auto-generated card records it as None). It achieves the following results on the evaluation set after the final epoch (epoch 30, step 9030):

  • Loss: 0.7668
  • Rouge1: 0.267
  • Rouge2: 0.1008
  • Rougel: 0.2321
  • Rougelsum: 0.2335
  • Gen Len: 19.36
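
To make the Rouge1 figure easier to interpret, here is a minimal sketch of a unigram-overlap F1 score. It is a simplification, assuming lowercasing and whitespace tokenization with no stemming; the scores reported above were presumably produced by a library implementation and would not match this sketch exactly. The function name `rouge1_f1` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    # Assumption: lowercase + whitespace tokenization, no stemming.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative strings only, not taken from the model's evaluation data.
    print(rouge1_f1("the cat sat on the mat", "the cat lay on a mat"))
```

On this reading, a Rouge1 of roughly 0.27 means that only about a quarter of the unigrams are shared between generated and reference texts once precision and recall are balanced.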

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
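
As a rough illustration of how `learning_rate` and `lr_scheduler_type: linear` interact, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and 301 optimizer steps per epoch, the figure reported in the training results, for a total of 9030 steps. The function name `linear_lr` is hypothetical and not part of any library.

```python
# Hyperparameters taken from the list above.
BASE_LR = 2e-05
NUM_EPOCHS = 30

# Assumption: 301 optimizer steps per epoch, as reported in the training results.
STEPS_PER_EPOCH = 301
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 9030

# Assumption: no warmup, since the card does not list a warmup setting.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for epoch in (0, 10, 20, 30):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d} (step {step:4d}): lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate starts at 2e-05, falls to about half that value midway through training, and reaches zero at the final step.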

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|--------|--------|--------|-----------|---------|
| No log | 1.0 | 301 | 0.8071 | 0.2304 | 0.0799 | 0.1958 | 0.1966 | 19.42 |
| 0.8375 | 2.0 | 602 | 0.8009 | 0.2368 | 0.0763 | 0.1966 | 0.1978 | 19.41 |
| 0.8375 | 3.0 | 903 | 0.7935 | 0.2496 | 0.08 | 0.2114 | 0.2119 | 19.57 |
| 0.8146 | 4.0 | 1204 | 0.7900 | 0.2556 | 0.0874 | 0.217 | 0.2178 | 19.55 |
| 0.7923 | 5.0 | 1505 | 0.7893 | 0.2565 | 0.089 | 0.2145 | 0.2157 | 19.46 |
| 0.7923 | 6.0 | 1806 | 0.7846 | 0.2512 | 0.087 | 0.2134 | 0.2139 | 19.35 |
| 0.7801 | 7.0 | 2107 | 0.7845 | 0.2499 | 0.082 | 0.2082 | 0.209 | 19.34 |
| 0.7801 | 8.0 | 2408 | 0.7792 | 0.2464 | 0.0822 | 0.2102 | 0.2105 | 19.52 |
| 0.7589 | 9.0 | 2709 | 0.7769 | 0.2433 | 0.0774 | 0.2049 | 0.206 | 19.32 |
| 0.7457 | 10.0 | 3010 | 0.7746 | 0.2459 | 0.0767 | 0.2073 | 0.2082 | 19.32 |
| 0.7457 | 11.0 | 3311 | 0.7740 | 0.245 | 0.0817 | 0.2086 | 0.2087 | 19.32 |
| 0.7495 | 12.0 | 3612 | 0.7736 | 0.2491 | 0.0824 | 0.2138 | 0.2144 | 19.31 |
| 0.7495 | 13.0 | 3913 | 0.7716 | 0.2541 | 0.0876 | 0.2172 | 0.2182 | 19.29 |
| 0.7341 | 14.0 | 4214 | 0.7694 | 0.2654 | 0.096 | 0.2274 | 0.2289 | 19.34 |
| 0.711 | 15.0 | 4515 | 0.7677 | 0.2672 | 0.1054 | 0.23 | 0.2312 | 19.35 |
| 0.711 | 16.0 | 4816 | 0.7698 | 0.2774 | 0.106 | 0.2395 | 0.2412 | 19.34 |
| 0.7158 | 17.0 | 5117 | 0.7693 | 0.269 | 0.0973 | 0.2301 | 0.2317 | 19.5 |
| 0.7158 | 18.0 | 5418 | 0.7683 | 0.2696 | 0.1062 | 0.233 | 0.2342 | 19.3 |
| 0.7016 | 19.0 | 5719 | 0.7694 | 0.2601 | 0.0971 | 0.2258 | 0.2271 | 19.29 |
| 0.6977 | 20.0 | 6020 | 0.7687 | 0.269 | 0.0999 | 0.2328 | 0.2341 | 19.33 |
| 0.6977 | 21.0 | 6321 | 0.7676 | 0.2658 | 0.099 | 0.2297 | 0.2304 | 19.32 |
| 0.7028 | 22.0 | 6622 | 0.7674 | 0.2654 | 0.0986 | 0.2299 | 0.2308 | 19.29 |
| 0.7028 | 23.0 | 6923 | 0.7676 | 0.2666 | 0.0993 | 0.231 | 0.2328 | 19.34 |
| 0.6698 | 24.0 | 7224 | 0.7667 | 0.2658 | 0.0989 | 0.2296 | 0.2314 | 19.35 |
| 0.6956 | 25.0 | 7525 | 0.7670 | 0.2669 | 0.0992 | 0.2295 | 0.2308 | 19.35 |
| 0.6956 | 26.0 | 7826 | 0.7669 | 0.2618 | 0.0966 | 0.2259 | 0.2269 | 19.31 |
| 0.6776 | 27.0 | 8127 | 0.7671 | 0.2658 | 0.0983 | 0.2296 | 0.2319 | 19.35 |
| 0.6776 | 28.0 | 8428 | 0.7672 | 0.2661 | 0.0998 | 0.2311 | 0.2328 | 19.35 |
| 0.6891 | 29.0 | 8729 | 0.7668 | 0.267 | 0.1008 | 0.2321 | 0.2335 | 19.35 |
| 0.6772 | 30.0 | 9030 | 0.7668 | 0.267 | 0.1008 | 0.2321 | 0.2335 | 19.36 |
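
The per-epoch log above can also be used to compare the final checkpoint against the best ones. The following sketch copies the epoch, validation loss, and Rouge1 columns from the training results and picks the best epoch under each criterion. The variable names are hypothetical, and the card does not state which checkpoint was actually published.

```python
# (epoch, validation loss, Rouge1) copied from the training results.
RESULTS = [
    (1, 0.8071, 0.2304), (2, 0.8009, 0.2368), (3, 0.7935, 0.2496),
    (4, 0.7900, 0.2556), (5, 0.7893, 0.2565), (6, 0.7846, 0.2512),
    (7, 0.7845, 0.2499), (8, 0.7792, 0.2464), (9, 0.7769, 0.2433),
    (10, 0.7746, 0.2459), (11, 0.7740, 0.245), (12, 0.7736, 0.2491),
    (13, 0.7716, 0.2541), (14, 0.7694, 0.2654), (15, 0.7677, 0.2672),
    (16, 0.7698, 0.2774), (17, 0.7693, 0.269), (18, 0.7683, 0.2696),
    (19, 0.7694, 0.2601), (20, 0.7687, 0.269), (21, 0.7676, 0.2658),
    (22, 0.7674, 0.2654), (23, 0.7676, 0.2666), (24, 0.7667, 0.2658),
    (25, 0.7670, 0.2669), (26, 0.7669, 0.2618), (27, 0.7671, 0.2658),
    (28, 0.7672, 0.2661), (29, 0.7668, 0.267), (30, 0.7668, 0.267),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])
final = RESULTS[-1]

if __name__ == "__main__":
    print(f"lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[1]:.4f})")
    print(f"highest Rouge1:         epoch {best_by_rouge1[0]} ({best_by_rouge1[2]:.4f})")
    print(f"final epoch:            epoch {final[0]} (loss {final[1]:.4f}, Rouge1 {final[2]:.4f})")
```

In this log the lowest validation loss falls at epoch 24 (0.7667) and the highest Rouge1 at epoch 16 (0.2774), while the headline results correspond to epoch 30. Validation loss changes by less than 0.001 over the last seven epochs, which suggests training had largely plateaued.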

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu118
  • Datasets 2.15.0
  • Tokenizers 0.15.0