metadata

language:
  - ga
  - en
license: apache-2.0
base_model: openai/whisper-medium
tags:
  - generated_from_trainer
datasets:
  - ymoslem/IWSLT2023-GA-EN
  - ymoslem/FLEURS-GA-EN
  - ymoslem/BitesizeIrish-GA-EN
  - ymoslem/SpokenWords-GA-EN-MTed
metrics:
  - bleu
  - wer
model-index:
  - name: Whisper Medium GA-EN Speech Translation
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia
          type: ymoslem/IWSLT2023-GA-EN
        metrics:
          - name: Bleu
            type: bleu
            value: 32.14
          - name: Wer
            type: wer
            value: 65.96127870328681

Whisper Medium GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-medium on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets. The best checkpoint (this version) is at step 1400, epoch 1.84 (4 x 0.46, the logged epoch multiplied by the number of GPUs), and achieves the following results on the evaluation set:

Loss: 1.0839
Bleu: 33.55
Chrf: 50.95
Wer: 60.1981
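
WER is reported here as a percentage, and some early checkpoints in the training results exceed 100. A minimal sketch of why that is possible, assuming the standard definition (word-level edit distance divided by the number of reference words): the function name `wer_percent` and the sample sentences are illustrative and are not taken from this model's evaluation code.

```python
def wer_percent(reference: str, hypothesis: str) -> float:
    """Word error rate in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic two-row dynamic programme).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Three inserted words against a two-word reference give a WER above 100.
print(wer_percent("hello world", "hello there big wide world"))
```

Because insertions count as errors while the denominator is fixed by the reference, an output that is much longer than the reference can push the score past 100.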

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.03
training_steps: 2000
mixed_precision_training: Native AMP
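
To make the schedule concrete, here is a minimal sketch of a linear scheduler with warmup in plain Python. It assumes the 0.03 warmup value is a fraction of the 2000 training steps (60 steps), and it follows the usual shape of linear warmup followed by linear decay to zero; the function name `linear_lr` and its defaults are illustrative, not the training script itself.

```python
import math


def linear_lr(step: int,
              base_lr: float = 1e-4,
              total_steps: int = 2000,
              warmup_ratio: float = 0.03) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    # Assumption: the warmup value is a ratio, so warmup lasts 60 of 2000 steps.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


for step in (0, 30, 60, 1400, 2000):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at 0.0001 at step 60 and has decayed to roughly 31% of that peak by step 1400.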

Hardware

4 x A40 48GB VRAM, with batch size 4 per machine (total: 16)

Training results

Training Loss	Epoch	Step	Bleu	Chrf	Validation Loss	Wer
2.9468	0.03	100	4.72	20.55	2.2829	120.6213
2.5074	0.07	200	7.81	25.23	2.0136	114.8131
2.2406	0.1	300	11.24	29.39	1.8224	95.9928
2.2466	0.13	400	16.01	34.73	1.6530	83.4309
2.0276	0.16	500	16.69	34.76	1.5344	94.2368
1.8429	0.2	600	21.37	37.48	1.4923	78.5682
1.7621	0.23	700	23.4	40.89	1.3666	74.3359
1.5629	0.26	800	24.76	44.63	1.2876	76.6321
1.5458	0.3	900	25.81	44.59	1.2178	72.6249
1.2971	0.33	1000	27.63	46.91	1.1823	70.2837
1.3852	0.36	1100	27.18	46.16	1.2303	70.6889
1.309	0.39	1200	27.65	47.41	1.1573	72.0396
1.1818	0.43	1300	31.17	48.36	1.1304	61.6389
1.2711	0.46	1400	33.55	50.95	1.0839	60.1981
1.1305	0.49	1500	30.37	50.78	1.0718	68.6628
1.0544	0.53	1600	26.98	48.1	1.1109	73.7506
1.125	0.56	1700	30.76	50.19	1.0709	61.7740
1.1348	0.59	1800	33.71	50.6	1.0530	59.9280
1.14	0.62	1900	31.45	50.16	1.0392	66.9068
1.1059	0.66	2000	32.14	50.84	1.0240	65.9613
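
The table can be scanned programmatically to see which checkpoint is best under each metric. A minimal sketch, with the rows transcribed from the table above (training loss and epoch omitted); the `Row` type and the `best_step` helper are illustrative names. With these numbers, step 1400 has the highest chrF, step 1800 has the highest BLEU and the lowest WER, and step 2000 has the lowest validation loss; which metric was used to select the released checkpoint is not stated here.

```python
from typing import NamedTuple


class Row(NamedTuple):
    step: int
    bleu: float
    chrf: float
    val_loss: float
    wer: float


# Evaluation rows copied from the training results table.
ROWS = [
    Row(100, 4.72, 20.55, 2.2829, 120.6213),
    Row(200, 7.81, 25.23, 2.0136, 114.8131),
    Row(300, 11.24, 29.39, 1.8224, 95.9928),
    Row(400, 16.01, 34.73, 1.6530, 83.4309),
    Row(500, 16.69, 34.76, 1.5344, 94.2368),
    Row(600, 21.37, 37.48, 1.4923, 78.5682),
    Row(700, 23.4, 40.89, 1.3666, 74.3359),
    Row(800, 24.76, 44.63, 1.2876, 76.6321),
    Row(900, 25.81, 44.59, 1.2178, 72.6249),
    Row(1000, 27.63, 46.91, 1.1823, 70.2837),
    Row(1100, 27.18, 46.16, 1.2303, 70.6889),
    Row(1200, 27.65, 47.41, 1.1573, 72.0396),
    Row(1300, 31.17, 48.36, 1.1304, 61.6389),
    Row(1400, 33.55, 50.95, 1.0839, 60.1981),
    Row(1500, 30.37, 50.78, 1.0718, 68.6628),
    Row(1600, 26.98, 48.1, 1.1109, 73.7506),
    Row(1700, 30.76, 50.19, 1.0709, 61.7740),
    Row(1800, 33.71, 50.6, 1.0530, 59.9280),
    Row(1900, 31.45, 50.16, 1.0392, 66.9068),
    Row(2000, 32.14, 50.84, 1.0240, 65.9613),
]

# BLEU and chrF are better when higher; loss and WER are better when lower.
HIGHER_IS_BETTER = {"bleu": True, "chrf": True, "val_loss": False, "wer": False}


def best_step(metric: str) -> int:
    """Return the step whose evaluation row is best for the given metric."""
    pick = max if HIGHER_IS_BETTER[metric] else min
    return pick(ROWS, key=lambda row: getattr(row, metric)).step


for metric in HIGHER_IS_BETTER:
    print(metric, best_step(metric))
```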

Framework versions

Transformers 4.39.3
Pytorch 2.0.1+cu118
Datasets 2.18.0
Tokenizers 0.15.2