---
base_model: google/mt5-small
tags:
- generated_from_trainer
datasets:
- govreport-summarization
metrics:
- rouge
model-index:
- name: mt5-small-finetuned-govreport-summarization
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: govreport-summarization
      type: govreport-summarization
      config: document
      split: train
      args: document
    metrics:
    - name: Rouge1
      type: rouge
      value: 5.4727
---
# mt5-small-finetuned-govreport-summarization
This model is a fine-tuned version of [google/mt5-small](https://huggingface.co./google/mt5-small) on the govreport-summarization dataset.
It achieves the following results on the evaluation set:
- Loss: 2.9193
- Rouge1: 5.4727
- Rouge2: 1.8064
- RougeL: 4.7904
- RougeLsum: 5.1785
## Model description
[google/mt5-small](https://huggingface.co./google/mt5-small) is the smallest released checkpoint of mT5, a multilingual variant of the T5 encoder-decoder architecture pretrained on the mC4 corpus. This model fine-tunes it for abstractive summarization of long government reports.
## Intended uses & limitations
The model is intended for summarizing English-language government reports in the style of the govreport-summarization dataset. Note the low absolute ROUGE scores above (Rouge1 ≈ 5.5): generated summaries are likely short and lossy, so outputs should be reviewed before use, and quality on other domains or languages is untested.
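A minimal inference sketch with the 🤗 Transformers summarization pipeline; the checkpoint id is inferred from this card's model index and the uploader's namespace, so treat it as an assumption:

```python
from transformers import pipeline

# Checkpoint id inferred from the model index and uploader namespace
# (an assumption; verify against the actual repo).
summarizer = pipeline(
    "summarization",
    model="HamzaSidhu786/mt5-small-finetuned-govreport-summarization",
)

report = "The agency's annual report describes program spending, audit findings, ..."
# truncation=True clips reports longer than the model's maximum input length
summary = summarizer(report, max_length=128, min_length=32, truncation=True)
print(summary[0]["summary_text"])
```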
## Training and evaluation data
The model was trained and evaluated on the `document` config of the govreport-summarization dataset (the model index above records `split: train`), which pairs long government reports with human-written summaries.
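The data can be loaded with 🤗 Datasets; the `ccdv/govreport-summarization` hub id below is an assumption, since the card records only the short dataset name:

```python
from datasets import load_dataset

# The hub id "ccdv/govreport-summarization" is an assumption; the card
# records only "govreport-summarization" with config "document".
dataset = load_dataset("ccdv/govreport-summarization", "document")
print(dataset)                     # available splits and row counts
print(dataset["train"][0].keys())  # field names, typically "report" and "summary"
```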
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a code sketch mapping them onto `Seq2SeqTrainingArguments` follows the list):
- learning_rate: 5.6e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 16
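For reproducibility, a sketch of how the listed hyperparameters map onto `Seq2SeqTrainingArguments` (the listed Adam betas and epsilon are the optimizer defaults; `output_dir` and the evaluation/generation settings are illustrative placeholders not recorded in the card):

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameter list above; output_dir and the evaluation/
# generation settings are illustrative placeholders.
training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-govreport-summarization",
    learning_rate=5.6e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=16,
    eval_strategy="epoch",       # the results table logs one validation pass per epoch
    predict_with_generate=True,  # compute ROUGE on generated summaries
)
```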
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 8.1803 | 1.0 | 225 | 3.4063 | 4.8262 | 1.0677 | 4.1029 | 4.6438 |
| 4.1012 | 2.0 | 450 | 3.2004 | 4.8880 | 1.2529 | 4.0737 | 4.6698 |
| 3.8386 | 3.0 | 675 | 3.1341 | 5.0027 | 1.1715 | 4.1397 | 4.7616 |
| 3.6986 | 4.0 | 900 | 3.0698 | 5.3287 | 1.6223 | 4.6697 | 5.0159 |
| 3.6007 | 5.0 | 1125 | 3.0346 | 5.5318 | 1.7741 | 4.8195 | 5.2351 |
| 3.5376 | 6.0 | 1350 | 3.0039 | 4.5345 | 1.3055 | 4.0118 | 4.3259 |
| 3.4794 | 7.0 | 1575 | 2.9845 | 4.7550 | 1.5096 | 4.2156 | 4.5376 |
| 3.4373 | 8.0 | 1800 | 2.9699 | 4.6843 | 1.4090 | 4.0942 | 4.4492 |
| 3.4007 | 9.0 | 2025 | 2.9569 | 5.5517 | 1.8103 | 4.8226 | 5.2639 |
| 3.3788 | 10.0 | 2250 | 2.9415 | 5.4873 | 1.8689 | 4.8027 | 5.2162 |
| 3.3549 | 11.0 | 2475 | 2.9429 | 5.3814 | 1.7672 | 4.7337 | 5.1079 |
| 3.3386 | 12.0 | 2700 | 2.9338 | 5.4238 | 1.7718 | 4.7339 | 5.1216 |
| 3.3195 | 13.0 | 2925 | 2.9224 | 5.4666 | 1.8941 | 4.7900 | 5.1824 |
| 3.3110 | 14.0 | 3150 | 2.9223 | 5.4197 | 1.7975 | 4.7752 | 5.1176 |
| 3.3027 | 15.0 | 3375 | 2.9202 | 5.4940 | 1.8446 | 4.7876 | 5.1981 |
| 3.2961 | 16.0 | 3600 | 2.9193 | 5.4727 | 1.8064 | 4.7904 | 5.1785 |
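The ROUGE values in this table appear to be fractions scaled by 100, as in the standard Trainer summarization recipe with the 🤗 Evaluate `rouge` metric; a small sketch of the same computation, with hypothetical prediction and reference strings:

```python
import evaluate

rouge = evaluate.load("rouge")

# Hypothetical prediction/reference pair; the Trainer computes these
# metrics over the evaluation split and reports them scaled by 100.
predictions = ["the agency reported a budget shortfall for fiscal year 2020"]
references = ["the report finds the agency faced a budget shortfall in fiscal year 2020"]

scores = rouge.compute(predictions=predictions, references=references)
print({name: round(value * 100, 4) for name, value in scores.items()})
```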
### Framework versions
- Transformers 4.42.3
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1