long-t5-tglobal-xl-qmsum-wip

⚠️ warning - this is a work in progress ⚠️

This model is a fine-tuned version of google/long-t5-tglobal-xl on the pszemraj/qmsum-cleaned dataset.

Refer to the dataset card for details. This model was trained with task/prompt prefixes at the start of each input, so inference should be run with the same prefixes.
An example of how to run inference is in the Colab notebook linked above.
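
As a rough illustration of prefix-style inference, here is a minimal sketch using the standard `transformers` seq2seq classes. Several things in it are assumptions rather than facts from this card: the full repo id (the `pszemraj/` namespace is guessed from the dataset name), the example prefix string (a placeholder, the real prefixes are documented on the dataset card), and the generation settings, which are arbitrary starting points.

```python
def build_input(task_prefix: str, transcript: str) -> str:
    """Prepend the task/prompt prefix to the transcript, mirroring training."""
    return f"{task_prefix.strip()} {transcript.strip()}"


def summarize(
    text: str,
    model_id: str = "pszemraj/long-t5-tglobal-xl-qmsum-wip",  # assumed repo id
    max_input_tokens: int = 16384,  # arbitrary cap, adjust to your hardware
    max_new_tokens: int = 128,      # arbitrary, not taken from the card
) -> str:
    """Generate a summary for an already-prefixed input string."""
    # Imported lazily so build_input() can be used without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the prefixed text, truncating anything beyond the chosen cap.
    inputs = tokenizer(
        text, return_tensors="pt", truncation=True, max_length=max_input_tokens
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the XL checkpoint, so it is left commented out):
# prefix = "Summarize the whole meeting."  # hypothetical prefix, see dataset card
# print(summarize(build_input(prefix, "Speaker A: ... Speaker B: ...")))
```

Keeping the prefix handling in a separate helper makes it easy to swap in whichever prefix the dataset card specifies without touching the generation code.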

Per-epoch results on the evaluation set (validation loss, ROUGE scores, and generation length) are listed in the table below.

Training procedure

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
1.5376	1.0	99	2.0104	35.8802	11.4595	23.6656	31.49	77.77
1.499	2.0	198	2.0358	35.1265	11.549	23.1062	30.8815	88.88
1.5034	3.0	297	2.0505	35.3881	11.509	23.1543	31.3295	80.8