metadata

base_model: google/flan-t5-base
library_name: peft
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: results
    results: []
pipeline_tag: text2text-generation

results

This model is a fine-tuned version of google/flan-t5-base on an unspecified dataset (the training run recorded no dataset name). It achieves the following results on the evaluation set:

Loss: 2.6519
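
Because the metadata lists peft as the library, the checkpoint is presumably an adapter applied on top of google/flan-t5-base rather than a full set of weights. The following is a minimal loading sketch under that assumption. `ADAPTER_ID` is a hypothetical placeholder, since this card does not state the repository or local path of the adapter, and the helper name `generate` is illustrative only.

```python
BASE_MODEL_ID = "google/flan-t5-base"  # taken from the card metadata
ADAPTER_ID = "path/to/adapter"  # hypothetical placeholder: not given in this card


def generate(prompt: str, adapter_id: str = ADAPTER_ID, max_new_tokens: int = 64) -> str:
    """Load the base model, attach the PEFT adapter, and generate text for one prompt."""
    # Imported lazily so the heavy dependencies are only needed when this is called.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForSeq2SeqLM.from_pretrained(BASE_MODEL_ID)

    # PeftModel.from_pretrained reads the adapter config, so the adapter type
    # (which this card does not state) does not need to be specified here.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With a real adapter path substituted for the placeholder, `generate("some prompt")` should return the decoded model output. No sample output is shown because none is recorded in this card.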

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 4
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 3
total_train_batch_size: 12
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 16
training_steps: 1698
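
Two of the values above are derived rather than set directly, and the sketch below reproduces them in plain Python. It assumes a single training device (the card does not state the device count) and mirrors the usual shape of a linear schedule with warmup: the rate rises from 0 to the peak over the warmup steps, then decays linearly to 0 at the final step. The function names are illustrative and not part of any library.

```python
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 3
WARMUP_STEPS = 16
TRAINING_STEPS = 1698
NUM_DEVICES = 1  # assumption: the card does not state the device count


def effective_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step for linear warmup plus linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: scale linearly from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    # Decay: scale linearly from the peak down to 0 at the last training step.
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TRAINING_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 8, 16, 850, 1698):
        print(step, linear_lr(step))
```

Under the single-device assumption, 4 × 3 gives the reported total_train_batch_size of 12. The schedule reaches the peak rate of 2e-05 at step 16 and returns to 0 at step 1698.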

Training results

Training Loss	Epoch	Step	Validation Loss
3.2568	0.59	50	2.9764
3.2186	1.18	100	2.9349
3.1884	1.76	150	2.8820
3.1448	2.35	200	2.8404
3.1166	2.94	250	2.8120
3.0742	3.53	300	2.7899
3.0662	4.12	350	2.7724
3.0379	4.71	400	2.7578
3.0301	5.29	450	2.7457
3.0071	5.88	500	2.7352
3.0084	6.47	550	2.7259
2.9632	7.06	600	2.7177
2.9706	7.65	650	2.7104
2.9543	8.24	700	2.7037
2.9573	8.82	750	2.6979
2.9663	9.41	800	2.6928
2.9243	10.0	850	2.6877
2.9451	10.59	900	2.6832
2.9027	11.18	950	2.6790
2.9255	11.76	1000	2.6754
2.9160	12.35	1050	2.6719
2.9155	12.94	1100	2.6688
2.9223	13.53	1150	2.6659
2.9141	14.12	1200	2.6635
2.8931	14.71	1250	2.6612
2.8988	15.29	1300	2.6590
2.8986	15.88	1350	2.6573
2.8998	16.47	1400	2.6558
2.9004	17.06	1450	2.6546
2.9036	17.65	1500	2.6535
2.8850	18.24	1550	2.6528
2.8994	18.82	1600	2.6522
2.8971	19.41	1650	2.6519
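
A few quantities that the table does not state directly can be estimated from it. The sketch below uses only the first and last logged rows and the total train batch size of 12. The perplexity figure assumes the reported loss is a mean per-token cross-entropy in nats, which this card does not confirm. The helper names are illustrative.

```python
import math

# (training loss, epoch, step, validation loss) for the first and last logged rows
FIRST_ROW = (3.2568, 0.59, 50, 2.9764)
LAST_ROW = (2.8971, 19.41, 1650, 2.6519)
TOTAL_TRAIN_BATCH_SIZE = 12


def steps_per_epoch(row) -> float:
    """Estimate optimizer steps per epoch from one logged row."""
    _, epoch, step, _ = row
    return step / epoch


def approx_examples_per_epoch(row) -> float:
    """Estimate training examples per epoch (approximate: the epoch column is rounded)."""
    return steps_per_epoch(row) * TOTAL_TRAIN_BATCH_SIZE


def perplexity(loss: float) -> float:
    """Convert a cross-entropy loss to perplexity, assuming the loss is in nats per token."""
    return math.exp(loss)


# Total drop in validation loss between the first and last evaluation.
validation_loss_drop = FIRST_ROW[3] - LAST_ROW[3]

if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch(LAST_ROW))
    print("examples per epoch:", approx_examples_per_epoch(LAST_ROW))
    print("validation loss drop:", validation_loss_drop)
    print("final validation perplexity:", perplexity(LAST_ROW[3]))
```

From the last row this gives about 85 optimizer steps per epoch, so roughly 1,020 training examples per epoch, and the 1698 training steps correspond to about 20 epochs. The validation loss falls by about 0.32 over the run, with most of the improvement in the early epochs.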

Framework versions

PEFT 0.8.2
Transformers 4.38.1
Pytorch 2.3.0+cu121
Datasets 2.17.0
Tokenizers 0.15.2