	---
	language:
	- ga
	- en
	license: apache-2.0
	base_model: openai/whisper-large
	tags:
	- generated_from_trainer
	datasets:
	- ymoslem/IWSLT2023-GA-EN
	- ymoslem/FLEURS-GA-EN
	- ymoslem/BitesizeIrish-GA-EN
	- ymoslem/SpokenWords-GA-EN-MTed
	metrics:
	- bleu
	- wer
	model-index:
	- name: Whisper Large GA-EN Speech Translation
	  results:
	  - task:
	      name: Automatic Speech Recognition
	      type: automatic-speech-recognition
	    dataset:
	      name: IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia
	      type: ymoslem/IWSLT2023-GA-EN
	    metrics:
	    - name: Bleu
	      type: bleu
	      value: 30.16
	    - name: Wer
	      type: wer
	      value: 69.968482665466
	---

	# Whisper Large GA-EN Speech Translation

	This model is a fine-tuned version of [openai/whisper-large](https://huggingface.co/openai/whisper-large) on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets.
	The datasets are augmented in two ways: noise augmentation and truncation of low-amplitude samples.
	The best checkpoint (this version), selected by ChrF, is at step 3000 (epoch 0.99).
	It achieves the following results on the evaluation set:
	- Loss: 1.1742
	- Bleu: 30.16
	- Chrf: 50.72
	- Wer: 69.9685
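
WER is reported here as a percentage. The card does not say which implementation produced it, so the following is only a minimal sketch that assumes the standard definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name and the sample sentences are hypothetical. Because insertions count as errors, the value can exceed 100, which is consistent with the early rows of the training results table.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using word-level Levenshtein distance.

    Assumption: whitespace tokenization and no text normalization; the
    evaluation behind this card may have used different preprocessing.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sentences, not taken from the evaluation set.
    print(word_error_rate("the cat sat", "the dog sat"))      # one substitution
    print(word_error_rate("hello", "hello there my friend"))  # three insertions
```

A hypothesis much longer than its reference is what pushes the score past 100: three inserted words against a one-word reference give a WER of 300.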

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0001
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 16
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 0.03
	- training_steps: 3000
	- mixed_precision_training: Native AMP
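
The relationship between some of these values can be sketched in a few lines. This is an illustration under stated assumptions, not the training code: it assumes a single device (which is consistent with 4 × 4 = 16) and a linear schedule that decays the learning rate to zero at the last step. The card lists 0.03 under warmup steps without saying how that value was interpreted, so the sketch takes warmup as a parameter and the example uses 0. The function names are hypothetical.

```python
def effective_batch_size(per_device_batch: int,
                         accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer update.

    Assumption: num_devices defaults to 1, since the card does not
    state the hardware setup.
    """
    return per_device_batch * accumulation_steps * num_devices


def linear_lr(step: int, base_lr: float, total_steps: int,
              warmup_steps: int = 0) -> float:
    """Linear warmup to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    # train_batch_size=4, gradient_accumulation_steps=4
    print(effective_batch_size(4, 4))
    # learning_rate=0.0001, training_steps=3000, warmup assumed to be 0
    for step in (0, 1500, 3000):
        print(step, linear_lr(step, 1e-4, 3000))
```

Under these assumptions the learning rate starts at 0.0001, is halved by step 1500, and reaches zero at step 3000.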

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Wer |
	|:-------------:|:-----:|:----:|:---------------:|:-----:|:-----:|:--------:|
	| 3.1833 | 0.03 | 100 | 2.5169 | 2.03 | 16.8 | 215.5786 |
	| 2.7632 | 0.07 | 200 | 2.1827 | 7.81 | 24.07 | 113.1022 |
	| 2.5687 | 0.1 | 300 | 2.0746 | 6.16 | 24.2 | 158.8474 |
	| 2.5615 | 0.13 | 400 | 1.9379 | 8.68 | 26.18 | 120.8465 |
	| 2.4554 | 0.16 | 500 | 1.8932 | 12.14 | 28.94 | 103.1067 |
	| 2.3546 | 0.2 | 600 | 1.8734 | 14.34 | 29.83 | 91.5353 |
	| 2.2804 | 0.23 | 700 | 1.8075 | 13.18 | 33.07 | 105.5380 |
	| 2.1408 | 0.26 | 800 | 1.7034 | 13.01 | 33.0 | 89.4642 |
	| 2.0411 | 0.3 | 900 | 1.6556 | 16.73 | 34.97 | 91.4453 |
	| 1.7766 | 0.33 | 1000 | 1.6505 | 17.21 | 35.54 | 83.5209 |
	| 1.7704 | 0.36 | 1100 | 1.5800 | 17.54 | 38.11 | 77.1724 |
	| 1.6537 | 0.39 | 1200 | 1.5684 | 14.2 | 35.39 | 95.6326 |
	| 1.4841 | 0.43 | 1300 | 1.4970 | 22.96 | 39.35 | 71.3643 |
	| 1.641 | 0.46 | 1400 | 1.4693 | 16.3 | 37.69 | 103.7821 |
	| 1.393 | 0.49 | 1500 | 1.3923 | 27.21 | 43.87 | 69.3381 |
	| 1.249 | 0.53 | 1600 | 1.3876 | 23.33 | 42.26 | 76.5421 |
	| 1.3385 | 0.56 | 1700 | 1.3404 | 23.86 | 42.82 | 75.0563 |
	| 1.2537 | 0.59 | 1800 | 1.3226 | 17.03 | 41.72 | 100.1801 |
	| 1.2891 | 0.62 | 1900 | 1.2995 | 27.26 | 43.62 | 69.1580 |
	| 1.226 | 0.66 | 2000 | 1.2605 | 30.89 | 47.34 | 63.5750 |
	| 1.1268 | 0.69 | 2100 | 1.2783 | 27.43 | 45.97 | 67.4921 |
	| 1.0007 | 0.72 | 2200 | 1.2521 | 27.21 | 47.25 | 71.0041 |
	| 0.9565 | 0.76 | 2300 | 1.2219 | 31.65 | 48.07 | 64.2053 |
	| 0.9309 | 0.79 | 2400 | 1.2193 | 31.4 | 48.18 | 64.1603 |
	| 0.7923 | 0.82 | 2500 | 1.2099 | 28.88 | 48.89 | 69.7884 |
	| 0.8199 | 0.85 | 2600 | 1.1972 | 29.37 | 48.07 | 67.3120 |
	| 0.6974 | 0.89 | 2700 | 1.1857 | 29.7 | 48.95 | 70.5988 |
	| 0.6736 | 0.92 | 2800 | 1.1884 | 29.33 | 48.97 | 72.7150 |
	| 0.6826 | 0.95 | 2900 | 1.1834 | 30.76 | 50.11 | 68.1225 |
	| 0.7001 | 0.99 | 3000 | 1.1742 | 30.16 | 50.72 | 69.9685 |
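
The choice of checkpoint depends on which metric is used for selection. The sketch below copies the rows for steps 2000 to 3000 from the table (earlier rows score worse on every metric) and picks the best step per metric. The `Checkpoint` type and `best_step` helper are hypothetical names for illustration, not part of the training script.

```python
from collections import namedtuple

Checkpoint = namedtuple("Checkpoint", "step validation_loss bleu chrf wer")

# Rows copied from the training results table (steps 2000 to 3000 only).
ROWS = [
    Checkpoint(2000, 1.2605, 30.89, 47.34, 63.5750),
    Checkpoint(2100, 1.2783, 27.43, 45.97, 67.4921),
    Checkpoint(2200, 1.2521, 27.21, 47.25, 71.0041),
    Checkpoint(2300, 1.2219, 31.65, 48.07, 64.2053),
    Checkpoint(2400, 1.2193, 31.40, 48.18, 64.1603),
    Checkpoint(2500, 1.2099, 28.88, 48.89, 69.7884),
    Checkpoint(2600, 1.1972, 29.37, 48.07, 67.3120),
    Checkpoint(2700, 1.1857, 29.70, 48.95, 70.5988),
    Checkpoint(2800, 1.1884, 29.33, 48.97, 72.7150),
    Checkpoint(2900, 1.1834, 30.76, 50.11, 68.1225),
    Checkpoint(3000, 1.1742, 30.16, 50.72, 69.9685),
]


def best_step(rows, metric: str, higher_is_better: bool) -> int:
    """Return the step whose value for `metric` is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: getattr(row, metric)).step


if __name__ == "__main__":
    print("chrf:", best_step(ROWS, "chrf", higher_is_better=True))
    print("bleu:", best_step(ROWS, "bleu", higher_is_better=True))
    print("wer:", best_step(ROWS, "wer", higher_is_better=False))
    print("validation_loss:",
          best_step(ROWS, "validation_loss", higher_is_better=False))
```

ChrF and validation loss both favor step 3000, while BLEU peaks at step 2300 and WER is lowest at step 2000, so selecting by a different metric would have produced a different checkpoint.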


	### Framework versions

	- Transformers 4.39.3
	- PyTorch 2.0.1+cu118
	- Datasets 2.18.0
	- Tokenizers 0.15.2