whisper-large-v3-chichewa
This model is a fine-tuned version of openai/whisper-large-v3 on a Chichewa dataset. The dataset details will be provided at a later stage. It achieves the following results on the evaluation set:
- Loss: 3.1427
- Wer: 101.3340
Model description
More information needed
Intended uses & limitations
The model is fine-tuned to perform transcription of Chichewa. There are several versions of this model; please refer to the usage example notebook to see how to find the commit whose model has the best WER.
Alternatively, you can explore the Files and Versions tab, go to the commit history, and find the commit with the best WER (around 61).
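A minimal sketch of listing the repo's commits with `huggingface_hub`, so you can pick the revision to load; matching a commit to its WER relies on the information in that commit's description, as noted above.

```python
from huggingface_hub import list_repo_commits

# Print every commit on the model repo; pick the one whose description
# reports the best WER (around 61) and use its hash as `revision`.
for commit in list_repo_commits("dmatekenya/whisper-large-v3-chichewa"):
    print(commit.commit_id, "-", commit.title)
```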
Also, it's worth noting that the model repo doesn't have a `tokenizer.json`; as a result, the model needs to be loaded using `WhisperProcessor` instead of `AutoModel` or other generic classes in Transformers.
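A minimal loading-and-transcription sketch under those constraints; the `revision` value and the audio file name are placeholders to replace with the commit hash you picked and your own 16 kHz recording.

```python
import torch
import librosa
from transformers import WhisperProcessor, WhisperForConditionalGeneration

model_id = "dmatekenya/whisper-large-v3-chichewa"
revision = "main"  # placeholder: use the commit hash with the best WER

# Load the processor (feature extractor + tokenizer) and the model from the chosen revision.
processor = WhisperProcessor.from_pretrained(model_id, revision=revision)
model = WhisperForConditionalGeneration.from_pretrained(model_id, revision=revision)
model.eval()

# Load a 16 kHz mono recording (file name is a placeholder).
audio, _ = librosa.load("chichewa_sample.wav", sr=16000)

# Convert the waveform to log-mel input features and generate a transcription.
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```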
Source of Funding for this Work
The dataset used to fine-tune this model, as well as the compute resources, were provided by Opportunity International.
This was part of a project in Malawi aimed at supporting the deployment of an LLM-based chatbot for agriculture, with the capability to handle voice interactions in the local language, Chichewa.
A total of 30 hours of audio was collected for this dataset, but due to data quality issues only 25 hours were used.
About 30 minutes was also removed and held out for further model evaluation.
Training and evaluation data
More information needed
Training procedure
Most of the training for this model involved experimenting with varying speech dataset sizes (5 hours, 10 hours, up to 24 hours). As such, the different model commits represent different data sizes. More details will be added to each model commit.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 10000
- mixed_precision_training: Native AMP
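As a rough illustration, these settings map onto `Seq2SeqTrainingArguments` roughly as follows; the output directory, evaluation/save cadence, and `predict_with_generate` are assumptions not stated in the card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v3-chichewa",  # assumption: output path not stated in the card
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    seed=42,
    # Adam betas/epsilon below are the Trainer defaults and match the values listed above.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=10000,
    fp16=True,  # "Native AMP" mixed-precision training
    eval_strategy="steps",
    eval_steps=1000,   # assumption: matches the evaluation cadence in the results table
    save_steps=1000,
    predict_with_generate=True,  # assumption: needed to compute WER from generated text
)
```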
Training results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 0.9229 | 7.0423 | 1000 | 2.0780 | 86.3539 |
| 0.1427 | 14.0845 | 2000 | 2.5560 | 83.5493 |
| 0.087 | 21.1268 | 3000 | 2.6909 | 80.8704 |
| 0.0742 | 28.1690 | 4000 | 2.8007 | 81.8982 |
| 0.065 | 35.2113 | 5000 | 2.8871 | 84.3639 |
| 0.0627 | 42.2535 | 6000 | 2.9465 | 84.5334 |
| 0.0586 | 49.2958 | 7000 | 3.0451 | 114.1600 |
| 0.063 | 56.3380 | 8000 | 3.0983 | 82.6964 |
| 0.0588 | 63.3803 | 9000 | 3.1352 | 81.0180 |
| 0.0591 | 70.4225 | 10000 | 3.1427 | 101.3340 |
Framework versions
- Transformers 4.45.1
- Pytorch 2.0.1
- Datasets 3.0.1
- Tokenizers 0.20.0