	---
	language:
	- ga
	- en
	license: apache-2.0
	base_model: openai/whisper-small
	tags:
	- generated_from_trainer
	datasets:
	- ymoslem/IWSLT2023-GA-EN
	- ymoslem/FLEURS-GA-EN
	- ymoslem/BitesizeIrish-GA-EN
	- ymoslem/SpokenWords-GA-EN-MTed
	metrics:
	- bleu
	- wer
	- chrf
	model-index:
	- name: Whisper Small GA-EN Speech Translation
	  results:
	  - task:
	      name: Automatic Speech Recognition
	      type: automatic-speech-recognition
	    dataset:
	      name: IWSLT-2023, FLEURS, BiteSize, SpokenWords
	      type: ymoslem/IWSLT2023-GA-EN
	    metrics:
	    - name: Bleu
	      type: bleu
	      value: 26.85
	    - name: Wer
	      type: wer
	      value: 73.52543899144528
	library_name: transformers
	---

	# Whisper Small GA-EN Speech Translation

	This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the IWSLT-2023, FLEURS, BiteSize, and SpokenWords datasets.
	The checkpoint published here is the best one by ChrF, at step 3300 (epoch 3.56 in the training results table). It achieves the following results on the evaluation set:
	- Loss: 1.5823
	- Bleu: 29.81
	- Chrf: 46.50
	- Wer: 66.7267

	The best checkpoint based on BLEU (step 3400, epoch 3.67) achieves the following results:
	- Loss: 1.5752
	- Bleu: 30.77
	- Chrf: 46.43
	- Wer: 64.6556
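The two selections above can be reproduced from the evaluation log. The following is a minimal sketch, not the author's selection code: it copies four rows (steps 3200 to 3500) from the training results table and picks the best row per metric. The names `EVAL_LOG` and `best_checkpoint` are hypothetical.

```python
# Rows copied from the training results table (steps 3200-3500 only).
EVAL_LOG = [
    {"step": 3200, "bleu": 30.20, "chrf": 45.80, "wer": 65.8262},
    {"step": 3300, "bleu": 29.81, "chrf": 46.50, "wer": 66.7267},
    {"step": 3400, "bleu": 30.77, "chrf": 46.43, "wer": 64.6556},
    {"step": 3500, "bleu": 30.30, "chrf": 46.02, "wer": 64.6105},
]


def best_checkpoint(log, metric, higher_is_better=True):
    """Return the log row with the best value for `metric`."""
    pick = max if higher_is_better else min
    return pick(log, key=lambda row: row[metric])


if __name__ == "__main__":
    print("best by chrF:", best_checkpoint(EVAL_LOG, "chrf")["step"])
    print("best by BLEU:", best_checkpoint(EVAL_LOG, "bleu")["step"])
    # WER is an error rate, so lower is better.
    print("best by WER: ", best_checkpoint(EVAL_LOG, "wer", higher_is_better=False)["step"])
```

The three metrics disagree on these rows (steps 3300, 3400, and 3500 respectively), which is why the card reports the ChrF-best and BLEU-best checkpoints separately.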

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Experiment

	- language=English
	- +more steps

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 32
	- eval_batch_size: 32
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 0.03
	- training_steps: 4000
	- mixed_precision_training: Native AMP
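To make the schedule concrete, here is a rough sketch of a linear warmup and decay curve using the values listed above. It rests on one assumption that the card does not confirm: that the `0.03` reported as `lr_scheduler_warmup_steps` is a warmup ratio, which would mean 120 warmup steps out of 4000. The function `linear_lr` is hypothetical and only mirrors the usual shape of a linear schedule; it is not the training code.

```python
import math


def linear_lr(step, base_lr=1e-4, total_steps=4000, warmup_ratio=0.03):
    """Learning rate at `step` for linear warmup followed by linear decay.

    ASSUMPTION: 0.03 is treated as a warmup ratio, not a step count.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 60, 120, 2060, 4000):
        print(step, linear_lr(step))
```

Under this reading the rate peaks at 0.0001 at step 120 and reaches zero at step 4000.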

	### Training results

	| Training Loss | Epoch | Step | Bleu | Chrf | Validation Loss | Wer |
	|:-------------:|:-----:|:----:|:-----:|:-----:|:---------------:|:--------:|
	| 2.4954 | 0.11 | 100 | 3.7 | 18.03 | 2.1286 | 179.7839 |
	| 2.045 | 0.22 | 200 | 12.65 | 25.53 | 1.8146 | 100.9005 |
	| 1.7928 | 0.32 | 300 | 13.78 | 30.2 | 1.7253 | 101.9811 |
	| 1.6615 | 0.43 | 400 | 15.8 | 31.88 | 1.6834 | 92.5259 |
	| 1.4491 | 0.54 | 500 | 15.61 | 36.27 | 1.5971 | 107.3841 |
	| 1.2074 | 0.65 | 600 | 19.92 | 36.31 | 1.5939 | 84.3314 |
	| 1.2308 | 0.76 | 700 | 20.37 | 38.72 | 1.5234 | 84.8267 |
	| 1.107 | 0.86 | 800 | 21.35 | 37.87 | 1.5460 | 82.8906 |
	| 0.9491 | 0.97 | 900 | 21.06 | 40.74 | 1.5161 | 82.5754 |
	| 0.384 | 1.08 | 1000 | 23.24 | 41.98 | 1.4927 | 82.2152 |
	| 0.362 | 1.19 | 1100 | 23.19 | 42.24 | 1.5567 | 80.2792 |
	| 0.3756 | 1.29 | 1200 | 27.83 | 43.8 | 1.5265 | 69.2481 |
	| 0.3401 | 1.4 | 1300 | 21.79 | 41.66 | 1.5522 | 92.3908 |
	| 0.3346 | 1.51 | 1400 | 24.61 | 42.15 | 1.5085 | 75.4615 |
	| 0.3101 | 1.62 | 1500 | 26.67 | 43.41 | 1.4933 | 70.7789 |
	| 0.3231 | 1.73 | 1600 | 27.95 | 42.82 | 1.4979 | 68.3026 |
	| 0.2665 | 1.83 | 1700 | 28.5 | 43.76 | 1.4977 | 68.1225 |
	| 0.2704 | 1.94 | 1800 | 28.15 | 43.87 | 1.5063 | 68.8429 |
	| 0.0769 | 2.05 | 1900 | 25.76 | 43.22 | 1.5162 | 77.6227 |
	| 0.0597 | 2.16 | 2000 | 25.04 | 43.15 | 1.5216 | 79.0635 |
	| 0.0743 | 2.27 | 2100 | 27.85 | 44.43 | 1.5313 | 68.3926 |
	| 0.0878 | 2.37 | 2200 | 27.54 | 43.96 | 1.5495 | 68.3476 |
	| 0.0712 | 2.48 | 2300 | 28.28 | 44.39 | 1.5355 | 65.8712 |
	| 0.0789 | 2.59 | 2400 | 28.64 | 44.75 | 1.5277 | 65.7812 |
	| 0.073 | 2.7 | 2500 | 29.09 | 44.65 | 1.5327 | 65.7812 |
	| 0.073 | 2.8 | 2600 | 25.26 | 43.44 | 1.5304 | 78.2981 |
	| 0.0697 | 2.91 | 2700 | 25.71 | 43.02 | 1.5460 | 78.4782 |
	| 0.0398 | 3.02 | 2800 | 28.26 | 44.71 | 1.5580 | 72.8501 |
	| 0.0302 | 3.13 | 2900 | 30.25 | 45.46 | 1.5688 | 66.1414 |
	| 0.0424 | 3.24 | 3000 | 29.88 | 45.21 | 1.5693 | 66.0964 |
	| 0.0397 | 3.34 | 3100 | 30.01 | 45.85 | 1.5934 | 65.6911 |
	| 0.0346 | 3.45 | 3200 | 30.2 | 45.8 | 1.5818 | 65.8262 |
	| 0.032 | 3.56 | 3300 | 29.81 | 46.5 | 1.5823 | 66.7267 |
	| 0.0348 | 3.67 | 3400 | 30.77 | 46.43 | 1.5752 | 64.6556 |
	| 0.0277 | 3.78 | 3500 | 30.3 | 46.02 | 1.5791 | 64.6105 |
	| 0.0364 | 3.88 | 3600 | 29.92 | 45.38 | 1.5895 | 65.0608 |
	| 0.0398 | 3.99 | 3700 | 27.79 | 44.59 | 1.6167 | 68.2575 |
	| 0.0152 | 4.1 | 3800 | 28.42 | 44.83 | 1.6241 | 67.5822 |
	| 0.0201 | 4.21 | 3900 | 29.02 | 45.11 | 1.6243 | 67.4921 |
	| 0.0168 | 4.31 | 4000 | 26.85 | 44.41 | 1.6195 | 73.5254 |
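Several early rows report a WER above 100 (for example 179.7839 at step 100). That is possible because WER divides the number of word-level edits by the reference length, so a hypothesis with many extra words can exceed 100%. The sketch below uses the standard definition with a plain edit-distance computation; it is not the evaluation code used for this model, and the example sentences are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER in percent: (substitutions + deletions + insertions) / reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # Invented example: four inserted words against a two-word reference.
    print(word_error_rate("good morning", "good morning to you all today"))  # 200.0
```

Why the early checkpoints in this run scored above 100 is not stated in the card; overly long outputs are one plausible cause.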


	### Framework versions

	- Transformers 4.39.3
	- PyTorch 2.2.1+cu121
	- Datasets 2.18.0
	- Tokenizers 0.15.2