---
base_model: ybelkada/flan-t5-xl-sharded-bf16
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flanT5-xl-3.1
    results: []
---

flanT5-xl-3.1

This model is a fine-tuned version of ybelkada/flan-t5-xl-sharded-bf16 on an unspecified dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the metrics):

  • Loss: 0.6821
  • Rouge1: 32.9118
  • Rouge2: 24.7369
  • RougeL: 29.6106
  • RougeLsum: 29.8854
  • Gen Len: 10.9379
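
The checkpoint can be loaded like any other seq2seq Transformers model. The snippet below is only a minimal sketch: the repo id is a placeholder for wherever this checkpoint lives (Hub repo or local directory), and bfloat16 loading is assumed from the bf16 base model.

```python
# Minimal inference sketch. Assumptions: the checkpoint loads with
# AutoModelForSeq2SeqLM, and "flanT5-xl-3.1" is a placeholder repo id / path.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "flanT5-xl-3.1"  # placeholder: replace with the actual Hub id or local path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("summarize: The model was fine-tuned for ten epochs.", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```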

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding training arguments follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
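
These values map one-to-one onto Hugging Face training arguments. The sketch below shows that mapping under the assumption that training used `Seq2SeqTrainer`; `output_dir` and `predict_with_generate` are assumptions for illustration, not values from this card.

```python
# Sketch of the reported hyperparameters as Seq2SeqTrainingArguments
# (assumed setup; the actual training script is not part of this card).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flanT5-xl-3.1",      # placeholder output directory
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    predict_with_generate=True,      # assumed, needed for ROUGE / Gen Len evaluation
)
```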

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 362  | 4.1919          | 14.9919 | 9.1411  | 12.8299 | 12.9602   | 15.8354 |
| 19.4803       | 2.0   | 724  | 0.8027          | 31.2365 | 23.4015 | 27.8979 | 28.0521   | 10.8913 |
| 0.87          | 3.0   | 1086 | 0.7601          | 32.7524 | 24.7831 | 29.4005 | 29.6329   | 10.4814 |
| 0.87          | 4.0   | 1448 | 0.7359          | 32.4199 | 24.3103 | 29.045  | 29.3107   | 10.7391 |
| 0.7969        | 5.0   | 1810 | 0.7159          | 33.081  | 24.9552 | 29.7936 | 30.0534   | 10.6770 |
| 0.7607        | 6.0   | 2172 | 0.7029          | 32.6081 | 24.4439 | 29.3121 | 29.5849   | 10.8820 |
| 0.7482        | 7.0   | 2534 | 0.6928          | 32.7673 | 24.6101 | 29.5065 | 29.7823   | 10.8820 |
| 0.7482        | 8.0   | 2896 | 0.6865          | 32.648  | 24.3905 | 29.4374 | 29.7019   | 11.0    |
| 0.729         | 9.0   | 3258 | 0.6831          | 32.7058 | 24.4816 | 29.4377 | 29.6886   | 11.0031 |
| 0.7218        | 10.0  | 3620 | 0.6821          | 32.9118 | 24.7369 | 29.6106 | 29.8854   | 10.9379 |
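
The ROUGE and Gen Len columns are the kind of metrics a `Seq2SeqTrainer` `compute_metrics` hook typically produces with the `evaluate` library. The sketch below is an assumption about how such a hook is usually written; it is not taken from the actual training script.

```python
# Sketch of a compute_metrics hook producing Rouge1/Rouge2/RougeL/RougeLsum and
# Gen Len, assuming the standard evaluate-based setup (not this card's script).
import numpy as np
import evaluate
from transformers import AutoTokenizer

rouge = evaluate.load("rouge")
tokenizer = AutoTokenizer.from_pretrained("ybelkada/flan-t5-xl-sharded-bf16")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Label padding (-100) must be replaced before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = rouge.compute(
        predictions=decoded_preds, references=decoded_labels, use_stemmer=True
    )
    # evaluate returns fractions; scale to percentages as reported above.
    result = {k: round(v * 100, 4) for k, v in result.items()}
    # Gen Len: average generated length in (non-padding) tokens.
    pred_lens = [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    result["gen_len"] = float(np.mean(pred_lens))
    return result
```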

Framework versions

  • Transformers 4.36.0.dev0
  • Pytorch 2.1.0+cu118
  • Datasets 2.15.0
  • Tokenizers 0.15.0