---
language:
- ur
license: apache-2.0
tags:
- automatic-speech-recognition
- hf-asr-leaderboard
- robust-speech-event
datasets:
- mozilla-foundation/common_voice_7_0
metrics:
- wer
- cer
model-index:
- name: wav2vec2-60-urdu
  results:
  - task:
      type: automatic-speech-recognition
      name: Speech Recognition
    dataset:
      type: mozilla-foundation/common_voice_7_0
      name: Common Voice ur
      args: ur
    metrics:
    - type: wer
      value: 59.1
      name: Test WER
      args:
        learning_rate: 0.0003
        train_batch_size: 16
        eval_batch_size: 8
        seed: 42
        gradient_accumulation_steps: 2
        total_train_batch_size: 32
        optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
        lr_scheduler_type: linear
        lr_scheduler_warmup_steps: 200
        num_epochs: 50
        mixed_precision_training: Native AMP
    - type: cer
      value: 33.1
      name: Test CER
      args:
        learning_rate: 0.0003
        train_batch_size: 16
        eval_batch_size: 8
        seed: 42
        gradient_accumulation_steps: 2
        total_train_batch_size: 32
        optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
        lr_scheduler_type: linear
        lr_scheduler_warmup_steps: 200
        num_epochs: 50
        mixed_precision_training: Native AMP
---
# wav2vec2-large-xlsr-53-urdu
This model is a fine-tuned version of [Harveenchadha/vakyansh-wav2vec2-urdu-urm-60](https://huggingface.co./Harveenchadha/vakyansh-wav2vec2-urdu-urm-60) on the Urdu subset of the Common Voice 7.0 dataset.
It achieves the following results on the evaluation set:
- WER: 0.5913
- CER: 0.3310
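
A minimal transcription sketch with 🤗 Transformers is shown below. The model id placeholder, the `sample.wav` path, and the 16 kHz resampling step are illustrative assumptions, not part of the released training script:

```python
import librosa
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "<this-model's-hub-id>"  # placeholder: substitute this repository's Hub id

processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Load and resample the audio to the 16 kHz rate expected by wav2vec2 models
speech, _ = librosa.load("sample.wav", sr=16_000)  # "sample.wav" is an illustrative path

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(inputs.input_values, attention_mask=inputs.attention_mask).logits

pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```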
## Model description
The combined training and validation data amounts to only 0.58 hours of audio. Training a model from scratch on so little data was not feasible, so the vakyansh-wav2vec2-urdu-urm-60 checkpoint was taken as the starting point and the wav2vec2 model was fine-tuned from it.
## Training procedure
The model was fine-tuned from Harveenchadha/vakyansh-wav2vec2-urdu-urm-60 because of the small number of training samples.
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- num_epochs: 50
- mixed_precision_training: Native AMP
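
Expressed as 🤗 `TrainingArguments`, these settings would look roughly like the sketch below; the `output_dir` name and the `fp16` flag (standing in for Native AMP) are assumptions, since the exact training script is not included in this card:

```python
from transformers import TrainingArguments

# Approximate reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="wav2vec2-60-urdu",   # assumed output directory
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size: 32
    num_train_epochs=50,
    lr_scheduler_type="linear",
    warmup_steps=200,
    seed=42,
    fp16=True,                       # mixed-precision training (Native AMP)
)
```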
### Training results
| Training Loss | Epoch | Step | Validation Loss | WER | Cer |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|
| 12.6045 | 8.33 | 100 | 8.4997 | 0.6978 | 0.3923 |
| 1.3367 | 16.67 | 200 | 5.0015 | 0.6515 | 0.3556 |
| 0.5344 | 25.0 | 300 | 9.3687 | 0.6393 | 0.3625 |
| 0.2922 | 33.33 | 400 | 9.2381 | 0.6236 | 0.3432 |
| 0.1867 | 41.67 | 500 | 6.2150 | 0.6035 | 0.3448 |
| 0.1166 | 50.0 | 600 | 6.4496 | 0.5913 | 0.3310 |
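
The reported WER and CER can be recomputed along these lines with the `datasets` metrics API available in the version listed below; the prediction and reference lists are placeholders, as the exact evaluation script is not part of this card:

```python
from datasets import load_metric

wer_metric = load_metric("wer")
cer_metric = load_metric("cer")

# Placeholders: decoded transcripts from the model and ground-truth transcripts
predictions = ["..."]
references = ["..."]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```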
### Framework versions
- Transformers 4.15.0
- Pytorch 1.10.0+cu111
- Datasets 1.17.0
- Tokenizers 0.10.3