---
license: openrail
library_name: peft
tags:
- generated_from_trainer
base_model: VietAI/envit5-translation
metrics:
- bleu
model-index:
- name: envit5-MedEV
  results: []
---
# envit5-MedEV

This model is a PEFT fine-tuned version of [VietAI/envit5-translation](https://huggingface.co./VietAI/envit5-translation) on the MedEV English-Vietnamese medical translation dataset.
It achieves the following results on the evaluation set:

- Loss: 0.0795
- BLEU on the MedEV test set: improved from 44.8343 (base model) to 47.903 (fine-tuned)

## Model description

envit5-MedEV is a PEFT adapter trained on top of [VietAI/envit5-translation](https://huggingface.co./VietAI/envit5-translation), an EnViT5 encoder-decoder model for English-Vietnamese translation. The adapter specializes the base model for medical-domain translation using the MedEV parallel corpus; the base model's weights are unchanged and must be loaded alongside the adapter.

## Intended uses & limitations

The adapter is intended for translating medical text between English and Vietnamese, loaded on top of the base model as shown below. It has only been evaluated on the MedEV test set; translation quality on general-domain text has not been assessed and should not be assumed.
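
A minimal loading-and-inference sketch, assuming the standard `peft` loading pattern. The adapter id below is a placeholder (replace it with this adapter's actual Hub path); the `en:`/`vi:` prefixes are the base model's documented input format:

```python
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base_id = "VietAI/envit5-translation"
adapter_id = "envit5-MedEV"  # placeholder: replace with the adapter's Hub repo id or local path

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the fine-tuned adapter weights

# EnviT5 marks the source language with an "en: " or "vi: " prefix on each input.
inputs = tokenizer(
    ["en: The patient was prescribed 5 mg of amlodipine once daily."],
    return_tensors="pt",
    padding=True,
)
outputs = model.generate(**inputs, max_length=512)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```

The base model emits the target-language prefix (e.g. `vi: `) at the start of each translation, so callers typically strip it from the decoded output.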

## Training and evaluation data

The model was fine-tuned and evaluated on the MedEV English-Vietnamese medical parallel corpus; the test-set BLEU reported above comes from the MedEV test split.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 5
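
For reference, a sketch of how these settings map onto `Seq2SeqTrainingArguments` in Transformers 4.40; this is a reconstruction, not the original training script, and `output_dir` and `predict_with_generate` are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="envit5-MedEV",       # placeholder output directory
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,   # 32 * 4 = 128 total train batch size
    lr_scheduler_type="cosine",
    warmup_steps=10,
    num_train_epochs=5,
    predict_with_generate=True,      # assumption: needed to compute BLEU during evaluation
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer's
# default optimizer settings, so no explicit optimizer arguments are needed.
```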

### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 33.2165 | 0.1314 | 700 | 0.5906 | 0.0653 |
| 0.4083 | 0.2628 | 1400 | 0.1096 | 13.8606 |
| 0.114 | 0.3942 | 2100 | 0.0918 | 14.7674 |
| 0.1027 | 0.5256 | 2800 | 0.0890 | 14.9410 |
| 0.0997 | 0.6571 | 3500 | 0.0873 | 15.0741 |
| 0.0973 | 0.7885 | 4200 | 0.0861 | 15.1717 |
| 0.0964 | 0.9199 | 4900 | 0.0852 | 15.2362 |
| 0.0949 | 1.0513 | 5600 | 0.0844 | 15.3131 |
| 0.0947 | 1.1827 | 6300 | 0.0838 | 15.3815 |
| 0.0937 | 1.3141 | 7000 | 0.0832 | 15.5075 |
| 0.0935 | 1.4455 | 7700 | 0.0827 | 15.5932 |
| 0.092 | 1.5769 | 8400 | 0.0822 | 15.6434 |
| 0.0924 | 1.7084 | 9100 | 0.0818 | 15.7233 |
| 0.0915 | 1.8398 | 9800 | 0.0815 | 15.8051 |
| 0.0915 | 1.9712 | 10500 | 0.0812 | 15.8279 |
| 0.0906 | 2.1026 | 11200 | 0.0809 | 15.8559 |
| 0.0904 | 2.2340 | 11900 | 0.0807 | 15.9008 |
| 0.0908 | 2.3654 | 12600 | 0.0805 | 15.8917 |
| 0.0904 | 2.4968 | 13300 | 0.0803 | 15.9352 |
| 0.0895 | 2.6282 | 14000 | 0.0802 | 15.9442 |
| 0.0896 | 2.7597 | 14700 | 0.0800 | 15.9677 |
| 0.0894 | 2.8911 | 15400 | 0.0800 | 15.9459 |
| 0.09 | 3.0225 | 16100 | 0.0799 | 15.9746 |
| 0.0895 | 3.1539 | 16800 | 0.0798 | 16.0154 |
| 0.0892 | 3.2853 | 17500 | 0.0797 | 15.9976 |
| 0.0896 | 3.4167 | 18200 | 0.0797 | 16.0193 |
| 0.0893 | 3.5481 | 18900 | 0.0796 | 16.0179 |
| 0.0888 | 3.6795 | 19600 | 0.0796 | 16.0510 |
| 0.0887 | 3.8110 | 20300 | 0.0796 | 16.0226 |
| 0.0891 | 3.9424 | 21000 | 0.0796 | 16.0277 |
| 0.0892 | 4.0738 | 21700 | 0.0796 | 16.0302 |
| 0.0892 | 4.2052 | 22400 | 0.0795 | 16.0425 |
| 0.0886 | 4.3366 | 23100 | 0.0795 | 16.0452 |
| 0.0889 | 4.4680 | 23800 | 0.0795 | 16.0518 |
| 0.0888 | 4.5994 | 24500 | 0.0795 | 16.0397 |
| 0.0893 | 4.7308 | 25200 | 0.0795 | 16.0450 |
| 0.0889 | 4.8623 | 25900 | 0.0795 | 16.0497 |
| 0.0887 | 4.9937 | 26600 | 0.0795 | 16.0497 |

Note: the validation BLEU logged during training is on a different scale from the 47.903 test-set BLEU reported above; the two numbers evidently come from different evaluation setups and are not directly comparable.
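
The card does not record how BLEU was computed; a common choice (an assumption here, as is whether the `en:`/`vi:` prefixes were stripped before scoring) is sacreBLEU via the `evaluate` library:

```python
import evaluate

# Assumption: BLEU scored with sacreBLEU through the `evaluate` library.
bleu = evaluate.load("sacrebleu")

predictions = ["Bệnh nhân được kê 5 mg amlodipine mỗi ngày một lần."]       # hypothetical model outputs
references = [["Bệnh nhân được kê đơn amlodipine 5 mg một lần mỗi ngày."]]  # hypothetical gold references

result = bleu.compute(predictions=predictions, references=references)
print(f"BLEU: {result['score']:.4f}")
```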

### Framework versions

- PEFT 0.10.0
- Transformers 4.40.2
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1