	---
	license: apache-2.0
	base_model: openai/whisper-large-v3
	tags:
	- generated_from_trainer
	metrics:
	- wer
	model-index:
	- name: Sep29-Mixat-whisper-lg-3-transliteration-0.1trainasval
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Sep29-Mixat-whisper-lg-3-transliteration-0.1trainasval

	This model is a fine-tuned version of [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.7875
	- Wer: 39.7972
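
For context, word error rate (WER) counts word-level substitutions, deletions, and insertions relative to the length of the reference transcript. The sketch below is a plain-Python illustration, not the evaluation code behind this card. The function name `word_error_rate` and the sample sentences are hypothetical, and it assumes the card reports WER as a percentage (the ratio multiplied by 100), which the magnitude of 39.7972 suggests.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical example: one substitution and one deletion over six words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {score:.4f}")
```

Because the denominator is the reference length, a hypothesis with many inserted words can score above 100.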

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 16
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 500
	- num_epochs: 100
	- mixed_precision_training: Native AMP
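
Two of the values above follow from the others. The sketch below reproduces `total_train_batch_size` and evaluates a linear warmup-then-decay learning rate at a given optimizer step. It assumes single-device training and a hypothetical `TOTAL_STEPS` of 21,000 (100 epochs at roughly 210 optimizer steps per epoch, inferred from the training results, where step 100 corresponds to epoch 0.4762). The helper name `linear_lr` is illustrative, and the formula follows the usual linear-with-warmup schedule rather than code taken from this training run.

```python
LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 500
NUM_DEVICES = 1       # assumption: single-device training
TOTAL_STEPS = 21_000  # assumption: 100 epochs x ~210 optimizer steps per epoch

# Examples consumed per optimizer update.
total_train_batch_size = (
    TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES
)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - WARMUP_STEPS)


print("total_train_batch_size:", total_train_batch_size)
for step in (0, 250, 500, 4400, TOTAL_STEPS):
    print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate peaks at step 500 and has decayed only partway by step 4400, the last step logged in the training results.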

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Wer |
	|:-------------:|:-------:|:----:|:---------------:|:-------:|
	| 0.7619 | 0.4762 | 100 | 0.5335 | 54.2752 |
	| 0.4908 | 0.9524 | 200 | 0.4641 | 49.9738 |
	| 0.3846 | 1.4286 | 300 | 0.4498 | 43.4342 |
	| 0.389 | 1.9048 | 400 | 0.4382 | 42.3151 |
	| 0.2851 | 2.3810 | 500 | 0.4605 | 42.1927 |
	| 0.2723 | 2.8571 | 600 | 0.4651 | 42.0878 |
	| 0.202 | 3.3333 | 700 | 0.4855 | 40.9862 |
	| 0.1731 | 3.8095 | 800 | 0.4809 | 41.3184 |
	| 0.1243 | 4.2857 | 900 | 0.5475 | 40.6540 |
	| 0.0988 | 4.7619 | 1000 | 0.5303 | 40.7064 |
	| 0.0742 | 5.2381 | 1100 | 0.5775 | 40.6889 |
	| 0.0531 | 5.7143 | 1200 | 0.5825 | 40.5316 |
	| 0.0482 | 6.1905 | 1300 | 0.5976 | 41.3534 |
	| 0.0368 | 6.6667 | 1400 | 0.6118 | 41.0911 |
	| 0.0312 | 7.1429 | 1500 | 0.6439 | 42.0353 |
	| 0.0242 | 7.6190 | 1600 | 0.6332 | 42.0528 |
	| 0.0239 | 8.0952 | 1700 | 0.6684 | 39.3251 |
	| 0.018 | 8.5714 | 1800 | 0.6527 | 42.3326 |
	| 0.019 | 9.0476 | 1900 | 0.6736 | 40.7239 |
	| 0.0153 | 9.5238 | 2000 | 0.6701 | 42.3326 |
	| 0.0168 | 10.0 | 2100 | 0.7033 | 43.6790 |
	| 0.0134 | 10.4762 | 2200 | 0.7028 | 40.2868 |
	| 0.0141 | 10.9524 | 2300 | 0.6997 | 43.9063 |
	| 0.0112 | 11.4286 | 2400 | 0.7055 | 42.1927 |
	| 0.0118 | 11.9048 | 2500 | 0.7112 | 40.4266 |
	| 0.0091 | 12.3810 | 2600 | 0.7509 | 41.5982 |
	| 0.0106 | 12.8571 | 2700 | 0.7075 | 42.7872 |
	| 0.0072 | 13.3333 | 2800 | 0.7263 | 43.3992 |
	| 0.0096 | 13.8095 | 2900 | 0.7365 | 42.5249 |
	| 0.0086 | 14.2857 | 3000 | 0.7722 | 42.0353 |
	| 0.0099 | 14.7619 | 3100 | 0.7480 | 40.9862 |
	| 0.0112 | 15.2381 | 3200 | 0.7422 | 40.9512 |
	| 0.0076 | 15.7143 | 3300 | 0.7749 | 41.3709 |
	| 0.0087 | 16.1905 | 3400 | 0.7505 | 39.9545 |
	| 0.0073 | 16.6667 | 3500 | 0.7583 | 41.8780 |
	| 0.0072 | 17.1429 | 3600 | 0.7541 | 41.0911 |
	| 0.0063 | 17.6190 | 3700 | 0.7516 | 40.9337 |
	| 0.0074 | 18.0952 | 3800 | 0.7798 | 41.2135 |
	| 0.0067 | 18.5714 | 3900 | 0.7780 | 40.4791 |
	| 0.0078 | 19.0476 | 4000 | 0.7596 | 41.2660 |
	| 0.0061 | 19.5238 | 4100 | 0.7660 | 39.6048 |
	| 0.0071 | 20.0 | 4200 | 0.7699 | 40.9862 |
	| 0.0045 | 20.4762 | 4300 | 0.7855 | 41.4583 |
	| 0.0055 | 20.9524 | 4400 | 0.7875 | 39.7972 |
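
In the table, validation loss reaches its minimum early and then rises while training loss keeps falling, and WER fluctuates from one evaluation to the next, so the final checkpoint is not necessarily the best one. A minimal sketch for locating the best rows follows. `ROWS` is a hand-copied subset of the table (step, validation loss, WER) that includes the rows with the lowest validation loss and the lowest WER; the variable names are illustrative.

```python
# (step, validation_loss, wer) rows copied from the training results table.
ROWS = [
    (100, 0.5335, 54.2752),
    (400, 0.4382, 42.3151),
    (1700, 0.6684, 39.3251),
    (4100, 0.7660, 39.6048),
    (4400, 0.7875, 39.7972),
]

best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_wer = min(ROWS, key=lambda row: row[2])
final = ROWS[-1]

print(f"lowest validation loss: step {best_by_loss[0]} (loss {best_by_loss[1]})")
print(f"lowest WER:             step {best_by_wer[0]} (WER {best_by_wer[2]})")
print(f"final checkpoint:       step {final[0]} (loss {final[1]}, WER {final[2]})")
```

The two criteria pick different checkpoints here, so which one counts as best depends on whether validation loss or WER is the selection metric.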


	### Framework versions

	- Transformers 4.43.4
	- PyTorch 2.4.1
	- Datasets 3.0.0
	- Tokenizers 0.19.1
