---
license: cc-by-nc-4.0
language:
- hyw
datasets:
- mozilla-foundation/common_voice_16_1
- google/fleurs
- ReRooted
pipeline_tag: automatic-speech-recognition
tags:
- audio-to-audio
- text-to-speech
- seamless_communication
---

# SeamlessM4T v2 ASR for Western Armenian

This model is a fine-tuned version of [facebook/seamless-m4t-v2-large](https://huggingface.co/facebook/seamless-m4t-v2-large). It was first fine-tuned on the Common Voice 16.1 and Google Fleurs datasets, then further fine-tuned on the [ReRooted](https://github.com/jhdeov/ReRooted-ArmenianCorpus) corpus.

The model achieves the following word error rate (WER) and character error rate (CER) on the Common Voice (CV) and Google Fleurs (GF) test sets:

- CV_wer: 0.308
- CV_cer: 0.07
- GF_wer: 0.311
- GF_cer: 0.094
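
For readers unfamiliar with these metrics, the sketch below shows how WER and CER are conventionally computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length, over words and characters respectively. It is an illustration only. The function names and the example strings are made up for this sketch, and the text normalization applied for the scores above (casing, punctuation handling) is not documented here, so the sketch applies none.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example pair (not taken from any test set):
# the hypothesis differs from the reference by one inserted letter.
reference = "բարի լոյս"
hypothesis = "բարի լույս"
print(f"WER: {wer(reference, hypothesis):.3f}")
print(f"CER: {cer(reference, hypothesis):.3f}")
```

A single misspelled word counts as a full word error but only a small character error, which is consistent with the CER values above being much lower than the WER values.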

After fine-tuning on Western Armenian data, the model occasionally translates Eastern Armenian speech into Western Armenian.

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-6
- train_batch_size: 4
- eval_batch_size: 1
- seed: 43
- optimizer: Adam with betas=(0.9, 0.98) and epsilon=1e-08
- lr_scheduler_type: MyleLR
- lr_scheduler_warmup_steps: 100
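
To give a rough sense of what the scheduler settings above imply, here is a minimal pure-Python sketch of the learning-rate curve. It assumes that `MyleLR` follows the usual inverse-square-root schedule (linear warmup to the base rate, then decay proportional to the square root of `warmup_steps / step`). The function name `myle_lr` and the `start_lr` default of 0 are assumptions made for this sketch, not values taken from the training run, and the code does not call fairseq2 itself.

```python
import math


def myle_lr(step, base_lr=1e-6, warmup_steps=100, start_lr=0.0):
    """Approximate learning rate at a given step.

    Assumed behaviour: linear warmup from start_lr to base_lr over
    warmup_steps, followed by inverse-square-root decay.
    start_lr=0.0 is an assumption; the value used in training is not stated.
    """
    if step < warmup_steps:
        return start_lr + (base_lr - start_lr) * step / warmup_steps
    return base_lr * math.sqrt(warmup_steps / step)


# Print the assumed schedule at a few steps for the hyperparameters listed above.
for step in (0, 50, 100, 400, 1600):
    print(f"step {step:>5}: lr = {myle_lr(step):.2e}")
```

Under these assumptions the rate peaks at 1e-6 at step 100 and halves each time the step count quadruples.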

### Framework versions

- PyTorch 2.1.1
- fairseq2==0.2.0