---
license: apache-2.0
library_name: peft
tags:
  - generated_from_trainer
base_model: google/flan-t5-small
model-index:
  - name: flan-t5-small-sql
    results: []
---

# flan-t5-small-sql

This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small). The training dataset is not specified in this card. It achieves the following results on the evaluation set:

- Loss: 0.3584

## Model description

More information needed

## Intended uses & limitations

More information needed
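
Although the card leaves usage undocumented, the model name, `peft` library tag, and base model point to text-to-SQL generation with a PEFT adapter on top of flan-t5-small. Below is a minimal inference sketch under that assumption; the adapter repo id (`kyryl-opens-ml/flan-t5-small-sql`) and the prompt format are illustrative, not confirmed by this card.

```python
# Minimal inference sketch. Assumptions: the adapter repo id and the
# prompt template below are illustrative, not documented in this card.
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base_id = "google/flan-t5-small"
adapter_id = "kyryl-opens-ml/flan-t5-small-sql"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the PEFT adapter

# Hypothetical prompt; the training prompt template is not documented.
prompt = "Translate to SQL: show the names of all customers from Germany."
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```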

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 128
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 1000.0
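
The actual training script is not part of this card, so the sketch below only reconstructs the values listed above; `output_dir` and anything not listed are placeholders.

```python
# Sketch only: maps the listed hyperparameters onto Seq2SeqTrainingArguments.
# The Adam settings (betas=(0.9, 0.999), epsilon=1e-08) are the transformers
# defaults, and multi-GPU execution on 8 devices is handled by the launcher
# (e.g. accelerate or torchrun), not by these arguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-small-sql",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=16,  # x 8 GPUs = 128 total train batch size
    per_device_eval_batch_size=16,   # x 8 GPUs = 128 total eval batch size
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=1000.0,
)
```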

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.3349        | 62.5   | 500  | 0.2107          |
| 0.1147        | 125.0  | 1000 | 0.2410          |
| 0.071         | 187.5  | 1500 | 0.2687          |
| 0.0502        | 250.0  | 2000 | 0.2901          |
| 0.0373        | 312.5  | 2500 | 0.3033          |
| 0.0301        | 375.0  | 3000 | 0.3141          |
| 0.025         | 437.5  | 3500 | 0.3235          |
| 0.0212        | 500.0  | 4000 | 0.3312          |
| 0.0187        | 562.5  | 4500 | 0.3404          |
| 0.017         | 625.0  | 5000 | 0.3371          |
| 0.0148        | 687.5  | 5500 | 0.3466          |
| 0.0139        | 750.0  | 6000 | 0.3480          |
| 0.0124        | 812.5  | 6500 | 0.3552          |
| 0.0118        | 875.0  | 7000 | 0.3594          |
| 0.0112        | 937.5  | 7500 | 0.3581          |
| 0.0106        | 1000.0 | 8000 | 0.3584          |

Validation loss bottoms out at step 500 (0.2107) and climbs steadily afterwards while training loss keeps falling, so the run appears to overfit well before the final epoch; the 0.3584 figure reported above is from the final checkpoint, not the best one by validation loss.

### Framework versions

- PEFT 0.7.1
- Transformers 4.38.0
- Pytorch 2.1.2+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2
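
Since mismatched `peft`/`transformers` versions are a common cause of adapter-loading errors, a quick check that the runtime matches the pins listed above:

```python
# Compare installed package versions against the ones this adapter
# was saved with (listed above).
import datasets
import peft
import tokenizers
import torch
import transformers

expected = {
    "peft": "0.7.1",
    "transformers": "4.38.0",
    "torch": "2.1.2+cu121",
    "datasets": "2.17.0",
    "tokenizers": "0.15.2",
}
installed = {
    "peft": peft.__version__,
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "OK" if installed[name] == want else "differs"
    print(f"{name}: installed {installed[name]}, card lists {want} [{status}]")
```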