---
license: openrail
library_name: peft
tags:
- generated_from_trainer
base_model: VietAI/envit5-translation
metrics:
- bleu
model-index:
- name: envit5-MedEV
  results: []
---
# envit5-MedEV

This model is a PEFT fine-tuned version of [VietAI/envit5-translation](https://huggingface.co./VietAI/envit5-translation) on the MedEV English-Vietnamese medical translation dataset.
It achieves the following results on the evaluation set:

- Loss: 0.0795
- BLEU on the MedEV test set: improved from 44.8343 (base model) to 47.903 (fine-tuned)

## Model description

envit5-MedEV is a PEFT adapter trained on top of [VietAI/envit5-translation](https://huggingface.co./VietAI/envit5-translation), an EnViT5 encoder-decoder model for English-Vietnamese translation. The adapter specializes the base model for medical-domain translation using the MedEV parallel corpus; the base model's weights are unchanged and must be loaded alongside the adapter.

## Intended uses & limitations

The adapter is intended for translating medical text between English and Vietnamese, loaded on top of the base model as shown below. It has only been evaluated on the MedEV test set; translation quality on general-domain text has not been assessed and should not be assumed.
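
A minimal loading-and-inference sketch, assuming the standard `peft` loading pattern. The adapter id below is a placeholder (replace it with this adapter's actual Hub path); the `en:`/`vi:` prefixes are the base model's documented input format:

```python
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base_id = "VietAI/envit5-translation"
adapter_id = "envit5-MedEV"  # placeholder: replace with the adapter's Hub repo id or local path

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the fine-tuned adapter weights

# EnviT5 marks the source language with an "en: " or "vi: " prefix on each input.
inputs = tokenizer(
    ["en: The patient was prescribed 5 mg of amlodipine once daily."],
    return_tensors="pt",
    padding=True,
)
outputs = model.generate(**inputs, max_length=512)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```

The base model emits the target-language prefix (e.g. `vi: `) at the start of each translation, so callers typically strip it from the decoded output.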

## Training and evaluation data

The model was fine-tuned and evaluated on the MedEV English-Vietnamese medical parallel corpus; the test-set BLEU reported above comes from the MedEV test split.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 5
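
For reference, a sketch of how these settings map onto `Seq2SeqTrainingArguments` in Transformers 4.40; this is a reconstruction, not the original training script, and `output_dir` and `predict_with_generate` are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="envit5-MedEV",       # placeholder output directory
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,   # 32 * 4 = 128 total train batch size
    lr_scheduler_type="cosine",
    warmup_steps=10,
    num_train_epochs=5,
    predict_with_generate=True,      # assumption: needed to compute BLEU during evaluation
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer's
# default optimizer settings, so no explicit optimizer arguments are needed.
```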

### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 33.2165 | 0.1314 | 700 | 0.5906 | 0.0653 |
| 0.4083 | 0.2628 | 1400 | 0.1096 | 13.8606 |
| 0.114 | 0.3942 | 2100 | 0.0918 | 14.7674 |
| 0.1027 | 0.5256 | 2800 | 0.0890 | 14.9410 |
| 0.0997 | 0.6571 | 3500 | 0.0873 | 15.0741 |
| 0.0973 | 0.7885 | 4200 | 0.0861 | 15.1717 |
| 0.0964 | 0.9199 | 4900 | 0.0852 | 15.2362 |
| 0.0949 | 1.0513 | 5600 | 0.0844 | 15.3131 |
| 0.0947 | 1.1827 | 6300 | 0.0838 | 15.3815 |
| 0.0937 | 1.3141 | 7000 | 0.0832 | 15.5075 |
| 0.0935 | 1.4455 | 7700 | 0.0827 | 15.5932 |
| 0.092 | 1.5769 | 8400 | 0.0822 | 15.6434 |
| 0.0924 | 1.7084 | 9100 | 0.0818 | 15.7233 |
| 0.0915 | 1.8398 | 9800 | 0.0815 | 15.8051 |
| 0.0915 | 1.9712 | 10500 | 0.0812 | 15.8279 |
| 0.0906 | 2.1026 | 11200 | 0.0809 | 15.8559 |
| 0.0904 | 2.2340 | 11900 | 0.0807 | 15.9008 |
| 0.0908 | 2.3654 | 12600 | 0.0805 | 15.8917 |
| 0.0904 | 2.4968 | 13300 | 0.0803 | 15.9352 |
| 0.0895 | 2.6282 | 14000 | 0.0802 | 15.9442 |
| 0.0896 | 2.7597 | 14700 | 0.0800 | 15.9677 |
| 0.0894 | 2.8911 | 15400 | 0.0800 | 15.9459 |
| 0.09 | 3.0225 | 16100 | 0.0799 | 15.9746 |
| 0.0895 | 3.1539 | 16800 | 0.0798 | 16.0154 |
| 0.0892 | 3.2853 | 17500 | 0.0797 | 15.9976 |
| 0.0896 | 3.4167 | 18200 | 0.0797 | 16.0193 |
| 0.0893 | 3.5481 | 18900 | 0.0796 | 16.0179 |
| 0.0888 | 3.6795 | 19600 | 0.0796 | 16.0510 |
| 0.0887 | 3.8110 | 20300 | 0.0796 | 16.0226 |
| 0.0891 | 3.9424 | 21000 | 0.0796 | 16.0277 |
| 0.0892 | 4.0738 | 21700 | 0.0796 | 16.0302 |
| 0.0892 | 4.2052 | 22400 | 0.0795 | 16.0425 |
| 0.0886 | 4.3366 | 23100 | 0.0795 | 16.0452 |
| 0.0889 | 4.4680 | 23800 | 0.0795 | 16.0518 |
| 0.0888 | 4.5994 | 24500 | 0.0795 | 16.0397 |
| 0.0893 | 4.7308 | 25200 | 0.0795 | 16.0450 |
| 0.0889 | 4.8623 | 25900 | 0.0795 | 16.0497 |
| 0.0887 | 4.9937 | 26600 | 0.0795 | 16.0497 |

Note: the validation BLEU logged during training is on a different scale from the 47.903 test-set BLEU reported above; the two numbers evidently come from different evaluation setups and are not directly comparable.
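
The card does not record how BLEU was computed; a common choice (an assumption here, as is whether the `en:`/`vi:` prefixes were stripped before scoring) is sacreBLEU via the `evaluate` library:

```python
import evaluate

# Assumption: BLEU scored with sacreBLEU through the `evaluate` library.
bleu = evaluate.load("sacrebleu")

predictions = ["Bệnh nhân được kê 5 mg amlodipine mỗi ngày một lần."]       # hypothetical model outputs
references = [["Bệnh nhân được kê đơn amlodipine 5 mg một lần mỗi ngày."]]  # hypothetical gold references

result = bleu.compute(predictions=predictions, references=references)
print(f"BLEU: {result['score']:.4f}")
```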

### Framework versions

- PEFT 0.10.0
- Transformers 4.40.2
- Pytorch 2.3.0
- Datasets 2.19.1
- Tokenizers 0.19.1