metadata

base_model: google/flan-t5-large
datasets:
  - samsum
library_name: peft
license: apache-2.0
metrics:
  - rouge
tags:
  - generated_from_trainer
widget:
  - text: Enter some text to summarize

FlanT5Summarization-samsum

This model is a fine-tuned version of google/flan-t5-large, trained on the samsum dataset with the PEFT library. It achieves the following results on the evaluation set:

Loss: 0.3001
Rouge1: 0.2788
Rouge2: 0.1310
Rougel: 0.2363
Rougelsum: 0.2369
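
As a rough illustration of what the Rouge1 figure measures, the sketch below computes unigram-overlap precision, recall, and F1 between a candidate summary and a reference. It is a simplified stand-in (lowercased whitespace tokenization, no stemming), not the implementation behind the `rouge` metric used for this card, and the example sentences are invented rather than taken from samsum.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Simplified ROUGE-1: clipped unigram overlap between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Invented example pair, used only to exercise the function.
scores = rouge1(
    "amanda baked cookies for jerry",
    "amanda baked cookies and will bring jerry some",
)
print(scores)
```

Rouge2 applies the same idea to bigrams, while Rougel and Rougelsum are based on the longest common subsequence.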

Model description

More information needed

Intended uses & limitations

More information needed
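
For summarization inference, a minimal sketch of loading the adapter on top of the base model might look like the following. `ADAPTER_ID` is a hypothetical placeholder (this card does not state the repository namespace), `max_new_tokens=64` is an arbitrary choice, and the dialogue is passed without any prompt prefix because the prompt format used in training is not documented here.

```python
BASE_MODEL_ID = "google/flan-t5-large"  # base_model from this card's metadata
ADAPTER_ID = "your-namespace/FlanT5Summarization-samsum"  # hypothetical placeholder


def summarize(dialogue: str, max_new_tokens: int = 64) -> str:
    """Load the base model plus the PEFT adapter and generate one summary."""
    # Imported lazily so this file can be loaded without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForSeq2SeqLM.from_pretrained(BASE_MODEL_ID)
    # Wrap the base model with the adapter weights.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()

    inputs = tokenizer(dialogue, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

This version reloads the weights on every call to keep the sketch short. For repeated use, load the tokenizer and model once and reuse them.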

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 128
eval_batch_size: 64
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 512
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 3
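
The scheduler settings above describe a linear warmup over the first 10% of optimizer steps followed by cosine decay. The sketch below reproduces that shape in plain Python, following the formula of `get_cosine_schedule_with_warmup` in Transformers with its default half cycle. The total of 171 optimizer steps is an assumption (the card does not state it; the results table ends at step 170), and the warmup length is taken as `0.1 * total_steps` rounded up.

```python
import math

BASE_LR = 3e-05      # learning_rate from the card
WARMUP_RATIO = 0.1   # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 171    # assumption: total optimizer steps are not stated on the card


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear ramp from 0 up to BASE_LR.
        return BASE_LR * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 9, 18, 95, 171):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the rate peaks at `3e-05` once warmup ends and decays toward zero by the final step.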

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum
1.1072	0.0866	5	0.9165	0.2705	0.1135	0.2226	0.2229
1.1039	0.1732	10	0.9080	0.2709	0.1138	0.2230	0.2234
1.0848	0.2597	15	0.8917	0.2706	0.1137	0.2228	0.2231
1.0706	0.3463	20	0.8654	0.2709	0.1142	0.2232	0.2234
1.0461	0.4329	25	0.8336	0.2706	0.1140	0.2228	0.2232
1.0187	0.5195	30	0.7960	0.2718	0.1145	0.2240	0.2243
0.9774	0.6061	35	0.7532	0.2723	0.1152	0.2250	0.2253
0.9326	0.6926	40	0.7064	0.2726	0.1153	0.2253	0.2257
0.8834	0.7792	45	0.6570	0.2728	0.1160	0.2259	0.2261
0.8330	0.8658	50	0.6080	0.2734	0.1161	0.2262	0.2263
0.7871	0.9524	55	0.5614	0.2726	0.1156	0.2260	0.2260
0.7350	1.0390	60	0.5180	0.2731	0.1169	0.2262	0.2264
0.6978	1.1255	65	0.4802	0.2736	0.1179	0.2275	0.2276
0.6464	1.2121	70	0.4482	0.2741	0.1188	0.2283	0.2286
0.6175	1.2987	75	0.4222	0.2742	0.1193	0.2291	0.2292
0.5722	1.3853	80	0.4007	0.2740	0.1187	0.2287	0.2287
0.5443	1.4719	85	0.3834	0.2730	0.1180	0.2282	0.2282
0.5203	1.5584	90	0.3692	0.2740	0.1192	0.2293	0.2293
0.4851	1.6450	95	0.3568	0.2744	0.1201	0.2300	0.2302
0.4619	1.7316	100	0.3466	0.2746	0.1201	0.2304	0.2305
0.4484	1.8182	105	0.3379	0.2754	0.1218	0.2314	0.2319
0.4357	1.9048	110	0.3305	0.2766	0.1241	0.2325	0.2330
0.4246	1.9913	115	0.3243	0.2772	0.1254	0.2338	0.2341
0.4074	2.0779	120	0.3190	0.2776	0.1263	0.2343	0.2347
0.3965	2.1645	125	0.3144	0.2775	0.1264	0.2342	0.2345
0.3922	2.2511	130	0.3105	0.2776	0.1266	0.2344	0.2347
0.3861	2.3377	135	0.3073	0.2786	0.1289	0.2357	0.2362
0.3820	2.4242	140	0.3048	0.2782	0.1289	0.2354	0.2358
0.3807	2.5108	145	0.3029	0.2787	0.1297	0.2359	0.2364
0.3717	2.5974	150	0.3016	0.2787	0.1303	0.2363	0.2367
0.3708	2.6840	155	0.3008	0.2788	0.1305	0.2363	0.2368
0.3720	2.7706	160	0.3003	0.2789	0.1310	0.2365	0.2370
0.3696	2.8571	165	0.3002	0.2788	0.1310	0.2363	0.2369
0.3646	2.9437	170	0.3001	0.2788	0.1310	0.2363	0.2369
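
Two quantities can be read off the table directly: the number of optimizer steps per epoch implied by the Step and Epoch columns, and the relative change in each evaluation metric between the first and last rows. The sketch below computes both from those two rows, copied by hand from the table. The variable names are illustrative.

```python
# Column order: train_loss, epoch, step, val_loss, rouge1, rouge2, rougeL, rougeLsum
first = (1.1072, 0.0866, 5, 0.9165, 0.2705, 0.1135, 0.2226, 0.2229)
last = (0.3646, 2.9437, 170, 0.3001, 0.2788, 0.1310, 0.2363, 0.2369)

# Optimizer steps per epoch implied by the last logged row.
steps_per_epoch = last[2] / last[1]

# Relative change of each evaluation metric from the first to the last row.
names = ("val_loss", "rouge1", "rouge2", "rougeL", "rougeLsum")
relative_change = {
    name: (end - start) / start
    for name, start, end in zip(names, first[3:], last[3:])
}

print(f"steps per epoch: {steps_per_epoch:.2f}")
for name, change in relative_change.items():
    print(f"{name}: {change:+.1%}")
```

Validation loss falls by about two thirds over the run, while Rouge1 rises by about 3% and Rouge2 by about 15% in relative terms.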

Framework versions

PEFT 0.12.0
Transformers 4.43.2
PyTorch 2.1.2
Datasets 2.20.0
Tokenizers 0.19.1
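
To check a local environment against the versions listed above, a small standard-library sketch such as the following can help. It assumes the usual PyPI distribution names (`torch` for PyTorch) and treats a local build tag such as `+cu121` as a match; the helper name `version_mismatches` is illustrative.

```python
from importlib import metadata

# Versions listed on this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "peft": "0.12.0",
    "transformers": "4.43.2",
    "torch": "2.1.2",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def version_mismatches(expected=EXPECTED):
    """Return {package: installed version or None} for every package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        # Ignore local build tags, e.g. "2.1.2+cu121" matches "2.1.2".
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[package] = installed
    return mismatches


print(version_mismatches() or "environment matches the card")
```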