---
base_model: ybelkada/flan-t5-xl-sharded-bf16
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: flan-xl-gen5
    results: []
---

# flan-xl-gen5

This model is a fine-tuned version of ybelkada/flan-t5-xl-sharded-bf16 on an unspecified dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows the metrics):

- Loss: 0.7669
- Rouge1: 24.6538
- Rouge2: 17.821
- Rougel: 21.5884
- Rougelsum: 21.9045
- Gen Len: 13.0515
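
As a quick usage reference, here is a minimal inference sketch with Transformers. The Hub repo id `devvanshhh/flan-xl-gen5` and the summarization-style prompt are assumptions; neither the task nor the exact loading setup is documented in this card.

```python
# Minimal inference sketch (assumed repo id and prompt format; not documented in this card).
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "devvanshhh/flan-xl-gen5"  # assumption: Hub repo id for this model

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The base checkpoint is a bf16 shard of FLAN-T5 XL, so bfloat16 is a natural choice here.
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("summarize: <your input text here>", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```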

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 5
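
As a rough guide, the hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as sketched below. The output directory is a placeholder and the actual training script is not part of this card; the optimizer settings listed above are the Trainer defaults, so they need no explicit arguments.

```python
# Sketch of Seq2SeqTrainingArguments mirroring the hyperparameters above
# (output_dir is a placeholder; the real training script is not documented here).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-xl-gen5",          # placeholder output directory
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=5,
    predict_with_generate=True,         # required to get ROUGE / Gen Len at evaluation time
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default optimizer setup.
)
```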

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| No log        | 1.0   | 328  | 24.8987         | 29.9366 | 22.9687 | 26.9975 | 27.1774   | 11.1203 |
| 25.467        | 2.0   | 656  | 1.3504          | 51.142  | 49.8705 | 51.1588 | 51.1528   | 0.0     |
| 25.467        | 3.0   | 984  | 0.8221          | 19.5594 | 12.7325 | 16.4586 | 16.7605   | 14.9278 |
| 1.8759        | 4.0   | 1312 | 0.7783          | 21.8348 | 14.9645 | 18.7764 | 19.0709   | 14.1100 |
| 0.8715        | 5.0   | 1640 | 0.7669          | 24.6538 | 17.821  | 21.5884 | 21.9045   | 13.0515 |
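
The ROUGE columns above are typically produced by a `compute_metrics` hook on the trainer. The exact metric code for this run is not documented; the sketch below shows a common pattern using the `evaluate` library, assuming a `tokenizer` like the one in the loading sketch above and scaling scores to percentages to match the table.

```python
# Hedged sketch of a typical ROUGE compute_metrics for Seq2SeqTrainer;
# the exact metric function used for this run is not documented in the card.
import numpy as np
import evaluate

rouge = evaluate.load("rouge")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Label positions set to -100 are ignored by the loss; restore pad ids before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)
    result = {k: v * 100 for k, v in result.items()}  # scale to percentages, as in the table above

    # "Gen Len": average generated length in non-padding tokens.
    gen_lens = [np.count_nonzero(np.array(p) != tokenizer.pad_token_id) for p in preds]
    result["gen_len"] = float(np.mean(gen_lens))
    return {k: round(float(v), 4) for k, v in result.items()}
```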

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0
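
A quick way to confirm a local environment matches these pins (a convenience check, not part of the original card):

```python
# Print installed versions to compare against the pins listed above.
import transformers, torch, datasets, tokenizers

print("transformers:", transformers.__version__)  # expected 4.35.2
print("torch:", torch.__version__)                # expected 2.1.0+cu118
print("datasets:", datasets.__version__)          # expected 2.15.0
print("tokenizers:", tokenizers.__version__)      # expected 0.15.0
```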