	---
	library_name: transformers
	language:
	- hr
	license: apache-2.0
	base_model: openai/whisper-large-v3
	tags:
	- generated_from_trainer
	metrics:
	- wer
	model-index:
	- name: whisper-large-v3-turbo-hr-parla
	results: []
	---


	# whisper-large-v3-turbo-hr-parla

	This model is a fine-tuned version of [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) on the classla/ParlaSpeech-HR dataset and an additional 400-hour private dataset, both extended with augmented data.
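
A minimal usage sketch, assuming the checkpoint is published on the Hub under the repo id `GoranS/whisper-large-v3-turbo-hr-parla` (the name used in the comparison table of this card) and that it loads through the standard `transformers` speech-recognition pipeline. The helper names `build_transcriber` and `transcribe` and the file name `sample.wav` are illustrative, not part of the original card.

```python
# ASSUMPTION: Hub repo id taken from the WER comparison table in this card.
MODEL_ID = "GoranS/whisper-large-v3-turbo-hr-parla"


def build_transcriber(model_id: str = MODEL_ID, device: str = "cpu"):
    """Create an automatic-speech-recognition pipeline for the checkpoint."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id, device=device)


def transcribe(asr, audio_path: str) -> str:
    """Transcribe one audio file, asking Whisper for Croatian transcription."""
    result = asr(
        audio_path,
        generate_kwargs={"language": "croatian", "task": "transcribe"},
    )
    return result["text"].strip()


# Example (downloads the model on first use; "sample.wav" is a placeholder):
#   asr = build_transcriber(device="cuda:0")
#   print(transcribe(asr, "sample.wav"))
```

Passing the language explicitly avoids relying on Whisper's automatic language detection, which can pick a closely related language on short clips.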


	It achieves the following results on the evaluation set:
	- Loss: 0.0816
	- Wer: 3.52%
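
For readers unfamiliar with the metric, the following is a minimal sketch of word error rate as word-level edit distance divided by the number of reference words. It is an illustration only: the card does not state which text normalization (casing, punctuation) or which WER implementation was used for the reported numbers, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

A WER of 3.52% therefore corresponds to roughly 3.5 word errors per 100 reference words on the evaluation set.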


	### WER comparison

	| Dataset | Model | WER (%) |
	|---------------------------------|----------------------------------------------------|----------|
	| google/fleurs hr_hr test | openai/whisper-base | 65.99 |
	| google/fleurs hr_hr test | openai/whisper-large-v3-turbo | 12.73 |
	| google/fleurs hr_hr test | slsolucije/whisper-large-v3-turbo-hr-parla-lora-merged | 9.93 |
	| google/fleurs hr_hr test | GoranS/whisper-large-v3-turbo-hr-parla | 8.66 |
	| GoranS/stt-croatian_99k_265_2 test | openai/whisper-large-v3-turbo | 22.93 |
	| GoranS/stt-croatian_99k_265_2 test | slsolucije/whisper-large-v3-turbo-hr-parla-lora-merged | 19.02 |
	| GoranS/stt-croatian_99k_265_2 test | GoranS/whisper-large-v3-turbo-hr-parla | 18.44 |
	| GoranS/stt-croatian-sl-31k test | openai/whisper-large-v3-turbo | 21.62 |
	| GoranS/stt-croatian-sl-31k test | slsolucije/whisper-large-v3-turbo-hr-parla-lora-merged | 17.07 |
	| GoranS/stt-croatian-sl-31k test | GoranS/whisper-large-v3-turbo-hr-parla | 16.97 |
	| parla_867k_2483_0.5 test | openai/whisper-large-v3-turbo | 10.23 |
	| parla_867k_2483_0.5 test | slsolucije/whisper-large-v3-turbo-hr-parla-lora-merged | 4.58 |
	| parla_867k_2483_0.5 test | GoranS/whisper-large-v3-turbo-hr-parla | 3.52 |
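
To make the comparison easier to read, the sketch below computes the relative WER reduction of this model against `openai/whisper-large-v3-turbo` on each test set, using only the numbers from the table above. The helper name `relative_reduction` is illustrative.

```python
BASELINE = "openai/whisper-large-v3-turbo"
FINE_TUNED = "GoranS/whisper-large-v3-turbo-hr-parla"

# WER (%) values copied from the comparison table.
WER = {
    "google/fleurs hr_hr test": {BASELINE: 12.73, FINE_TUNED: 8.66},
    "GoranS/stt-croatian_99k_265_2 test": {BASELINE: 22.93, FINE_TUNED: 18.44},
    "GoranS/stt-croatian-sl-31k test": {BASELINE: 21.62, FINE_TUNED: 16.97},
    "parla_867k_2483_0.5 test": {BASELINE: 10.23, FINE_TUNED: 3.52},
}


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative WER reduction in percent: (baseline - improved) / baseline."""
    return (baseline - improved) / baseline * 100.0


for dataset, scores in WER.items():
    drop = relative_reduction(scores[BASELINE], scores[FINE_TUNED])
    print(f"{dataset}: {scores[BASELINE]:.2f} -> {scores[FINE_TUNED]:.2f} "
          f"({drop:.1f}% relative reduction)")
```

The largest gain, about 66% relative, is on the parliamentary test set that matches the training domain; on the other three sets the reduction is roughly 20 to 32%.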


	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 6.25e-06
	- train_batch_size: 64
	- eval_batch_size: 32
	- seed: 42
	- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08, no additional optimizer arguments
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 800
	- num_epochs: 2
	- mixed_precision_training: Native AMP
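
The learning-rate settings above can be sketched as a plain function: a linear ramp from 0 to the peak rate over the warmup steps, followed by a linear decay to 0. This mirrors the usual behaviour of a `linear` scheduler with warmup, but it is an illustration rather than the training code. The total step count is an assumption: it is not listed in the card and is estimated here from the last logged row (step 28000 at epoch 1.9678, so about 28,458 steps for 2 epochs).

```python
BASE_LR = 6.25e-06    # learning_rate from the list above
WARMUP_STEPS = 800    # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 28_458  # ASSUMPTION: estimated, not stated in the card


def linear_lr(step: int, base_lr: float = BASE_LR,
              warmup: int = WARMUP_STEPS, total: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step for linear warmup and decay."""
    if step < warmup:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr at the end of warmup to 0 at the last step.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for step in (0, 400, 800, 14_000, 28_000):
    print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the peak rate of 6.25e-06 is reached at step 800 and the rate is close to zero by the final logged checkpoint.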

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Wer |
	|:-------------:|:------:|:-----:|:---------------:|:------:|
	| 0.1485 | 0.0703 | 1000 | 0.1376 | 0.0610 |
	| 0.1399 | 0.1406 | 2000 | 0.1210 | 0.0545 |
	| 0.1311 | 0.2108 | 3000 | 0.1144 | 0.0529 |
	| 0.119 | 0.2811 | 4000 | 0.1058 | 0.0487 |
	| 0.1165 | 0.3514 | 5000 | 0.1067 | 0.0517 |
	| 0.1142 | 0.4217 | 6000 | 0.1007 | 0.0464 |
	| 0.1095 | 0.4920 | 7000 | 0.1019 | 0.0447 |
	| 0.1112 | 0.5622 | 8000 | 0.0974 | 0.0425 |
	| 0.1104 | 0.6325 | 9000 | 0.0971 | 0.0442 |
	| 0.1081 | 0.7028 | 10000 | 0.0943 | 0.0411 |
	| 0.1025 | 0.7731 | 11000 | 0.0905 | 0.0397 |
	| 0.1042 | 0.8433 | 12000 | 0.0930 | 0.0419 |
	| 0.1031 | 0.9136 | 13000 | 0.0923 | 0.0428 |
	| 0.1038 | 0.9839 | 14000 | 0.0894 | 0.0408 |
	| 0.0878 | 1.0542 | 15000 | 0.0902 | 0.0408 |
	| 0.0886 | 1.1245 | 16000 | 0.0869 | 0.0369 |
	| 0.0864 | 1.1947 | 17000 | 0.0861 | 0.0364 |
	| 0.0817 | 1.2650 | 18000 | 0.0867 | 0.0408 |
	| 0.0899 | 1.3353 | 19000 | 0.0852 | 0.0383 |
	| 0.0868 | 1.4056 | 20000 | 0.0846 | 0.0369 |
	| 0.0858 | 1.4759 | 21000 | 0.0844 | 0.0378 |
	| 0.0827 | 1.5461 | 22000 | 0.0845 | 0.0391 |
	| 0.0798 | 1.6164 | 23000 | 0.0846 | 0.0378 |
	| 0.0845 | 1.6867 | 24000 | 0.0833 | 0.0375 |
	| 0.0768 | 1.7570 | 25000 | 0.0840 | 0.0375 |
	| 0.0799 | 1.8273 | 26000 | 0.0837 | 0.0375 |
	| 0.0808 | 1.8975 | 27000 | 0.0825 | 0.0352 |
	| 0.0837 | 1.9678 | 28000 | 0.0816 | 0.0352 |
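
Two things in the table above are easy to misread, so here is a small worked sketch. First, the `Wer` column is a fraction, so the final 0.0352 is the 3.52% reported in the summary. Second, the step and epoch columns allow a rough estimate of the training set size. That estimate assumes an effective batch size equal to `train_batch_size` (64), i.e. a single device and no gradient accumulation, which the card does not confirm; the epoch values are also rounded, so the result is approximate.

```python
# (step, epoch, validation_loss, wer) for selected rows of the table above.
ROWS = [
    (1000, 0.0703, 0.1376, 0.0610),
    (14000, 0.9839, 0.0894, 0.0408),
    (27000, 1.8975, 0.0825, 0.0352),
    (28000, 1.9678, 0.0816, 0.0352),
]
TRAIN_BATCH_SIZE = 64  # ASSUMPTION: treated as the effective batch size


def steps_per_epoch(step: int, epoch: float) -> float:
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return step / epoch


last_step, last_epoch, last_loss, last_wer = ROWS[-1]
spe = steps_per_epoch(last_step, last_epoch)
approx_examples = spe * TRAIN_BATCH_SIZE

print(f"final WER: {last_wer * 100:.2f}%")
print(f"steps per epoch: about {spe:.0f}")
print(f"training examples per epoch: about {approx_examples:,.0f}")
```

Under these assumptions one epoch is about 14,200 steps, or on the order of 910,000 training examples including augmented copies.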


	### Framework versions

	- Transformers 4.46.3
	- Pytorch 2.5.0+cu121
	- Datasets 3.1.0
	- Tokenizers 0.20.3
