	---
	license: apache-2.0
	base_model: google/flan-t5-small
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: fine-tune-llama2
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# fine-tune-flan-t5

	This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unspecified dataset (the Trainer recorded no dataset name).
	It achieves the following results on the evaluation set:
	- Loss: 0.7668
	- Rouge1: 0.267
	- Rouge2: 0.1008
	- Rougel: 0.2321
	- Rougelsum: 0.2335
	- Gen Len: 19.36
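
For readers unfamiliar with the ROUGE scores listed above, the following is a minimal sketch of what a ROUGE-1 F1 score measures: unigram overlap between a reference text and a generated text. It is a simplification that assumes whitespace tokenization and lowercasing, and it is not the implementation that produced the numbers in this card. The function name `rouge1_f1` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between reference and candidate."""
    # Assumption: lowercase and split on whitespace (real scorers tokenize differently).
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Illustrative strings only; they are not taken from the evaluation set.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(f"ROUGE-1 F1 (simplified): {score:.4f}")
```

Five of the six unigrams match in each direction here, so precision and recall are both 5/6.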

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 30
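
The `linear` scheduler combined with a learning rate of 2e-05 can be sketched as below. This is an illustration under stated assumptions: no warmup steps (none are listed in this card) and a decay to zero at the final step, 9030, taken from the last row of the training results table. The helper name `linear_lr` is hypothetical.

```python
def linear_lr(step: int, base_lr: float = 2e-5, total_steps: int = 9030,
              warmup_steps: int = 0) -> float:
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    # Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 301 optimizer steps per epoch, as reported in the training results table.
for epoch in (0, 10, 20, 30):
    step = epoch * 301
    print(f"epoch {epoch:2d}  step {step:4d}  lr {linear_lr(step):.3e}")
```

Under these assumptions the learning rate is half its initial value at the midpoint of training (step 4515) and reaches zero at step 9030.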

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
	| No log | 1.0 | 301 | 0.8071 | 0.2304 | 0.0799 | 0.1958 | 0.1966 | 19.42 |
	| 0.8375 | 2.0 | 602 | 0.8009 | 0.2368 | 0.0763 | 0.1966 | 0.1978 | 19.41 |
	| 0.8375 | 3.0 | 903 | 0.7935 | 0.2496 | 0.08 | 0.2114 | 0.2119 | 19.57 |
	| 0.8146 | 4.0 | 1204 | 0.7900 | 0.2556 | 0.0874 | 0.217 | 0.2178 | 19.55 |
	| 0.7923 | 5.0 | 1505 | 0.7893 | 0.2565 | 0.089 | 0.2145 | 0.2157 | 19.46 |
	| 0.7923 | 6.0 | 1806 | 0.7846 | 0.2512 | 0.087 | 0.2134 | 0.2139 | 19.35 |
	| 0.7801 | 7.0 | 2107 | 0.7845 | 0.2499 | 0.082 | 0.2082 | 0.209 | 19.34 |
	| 0.7801 | 8.0 | 2408 | 0.7792 | 0.2464 | 0.0822 | 0.2102 | 0.2105 | 19.52 |
	| 0.7589 | 9.0 | 2709 | 0.7769 | 0.2433 | 0.0774 | 0.2049 | 0.206 | 19.32 |
	| 0.7457 | 10.0 | 3010 | 0.7746 | 0.2459 | 0.0767 | 0.2073 | 0.2082 | 19.32 |
	| 0.7457 | 11.0 | 3311 | 0.7740 | 0.245 | 0.0817 | 0.2086 | 0.2087 | 19.32 |
	| 0.7495 | 12.0 | 3612 | 0.7736 | 0.2491 | 0.0824 | 0.2138 | 0.2144 | 19.31 |
	| 0.7495 | 13.0 | 3913 | 0.7716 | 0.2541 | 0.0876 | 0.2172 | 0.2182 | 19.29 |
	| 0.7341 | 14.0 | 4214 | 0.7694 | 0.2654 | 0.096 | 0.2274 | 0.2289 | 19.34 |
	| 0.711 | 15.0 | 4515 | 0.7677 | 0.2672 | 0.1054 | 0.23 | 0.2312 | 19.35 |
	| 0.711 | 16.0 | 4816 | 0.7698 | 0.2774 | 0.106 | 0.2395 | 0.2412 | 19.34 |
	| 0.7158 | 17.0 | 5117 | 0.7693 | 0.269 | 0.0973 | 0.2301 | 0.2317 | 19.5 |
	| 0.7158 | 18.0 | 5418 | 0.7683 | 0.2696 | 0.1062 | 0.233 | 0.2342 | 19.3 |
	| 0.7016 | 19.0 | 5719 | 0.7694 | 0.2601 | 0.0971 | 0.2258 | 0.2271 | 19.29 |
	| 0.6977 | 20.0 | 6020 | 0.7687 | 0.269 | 0.0999 | 0.2328 | 0.2341 | 19.33 |
	| 0.6977 | 21.0 | 6321 | 0.7676 | 0.2658 | 0.099 | 0.2297 | 0.2304 | 19.32 |
	| 0.7028 | 22.0 | 6622 | 0.7674 | 0.2654 | 0.0986 | 0.2299 | 0.2308 | 19.29 |
	| 0.7028 | 23.0 | 6923 | 0.7676 | 0.2666 | 0.0993 | 0.231 | 0.2328 | 19.34 |
	| 0.6698 | 24.0 | 7224 | 0.7667 | 0.2658 | 0.0989 | 0.2296 | 0.2314 | 19.35 |
	| 0.6956 | 25.0 | 7525 | 0.7670 | 0.2669 | 0.0992 | 0.2295 | 0.2308 | 19.35 |
	| 0.6956 | 26.0 | 7826 | 0.7669 | 0.2618 | 0.0966 | 0.2259 | 0.2269 | 19.31 |
	| 0.6776 | 27.0 | 8127 | 0.7671 | 0.2658 | 0.0983 | 0.2296 | 0.2319 | 19.35 |
	| 0.6776 | 28.0 | 8428 | 0.7672 | 0.2661 | 0.0998 | 0.2311 | 0.2328 | 19.35 |
	| 0.6891 | 29.0 | 8729 | 0.7668 | 0.267 | 0.1008 | 0.2321 | 0.2335 | 19.35 |
	| 0.6772 | 30.0 | 9030 | 0.7668 | 0.267 | 0.1008 | 0.2321 | 0.2335 | 19.36 |
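
Since the training data is not documented, the step counts in the table give a rough bound on its size. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept; none of these settings is stated in this card, so the result is an estimate only.

```python
steps_per_epoch = 301    # step count at epoch 1.0 in the table
train_batch_size = 4     # from the hyperparameter list
num_epochs = 30          # from the hyperparameter list

# If steps_per_epoch == ceil(n_examples / train_batch_size), then n_examples
# must fall in the range below (assumption: the last partial batch is kept).
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

total_steps = steps_per_epoch * num_epochs

print(f"estimated training examples: {min_examples} to {max_examples}")
print(f"total optimizer steps: {total_steps}")
```

The total of 9030 steps matches the final row of the table, which is consistent with 301 steps in each of the 30 epochs.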


	### Framework versions

	- Transformers 4.35.2
	- Pytorch 2.1.0+cu118
	- Datasets 2.15.0
	- Tokenizers 0.15.0