Update README.md
README.md
CHANGED
@@ -68,8 +68,11 @@ print("Prediction:", processor.batch_decode(predicted_ids))
 print("Reference:", test_dataset["sentence"][:2])
 ```
 
+Result:
+```
 Prediction: ['yaherukaga gukora igitaramo y iki mu jyiwa na mul mumbiliki', 'ini rero ntibizashoboka ka nibo nkunrabibzi']
 Reference: ['Yaherukaga gukora igitaramo nk’iki mu Mujyi wa Namur mu Bubiligi.', 'Ibi rero, ntibizashoboka, kandi nawe arabizi.']
+```
 
 ## Evaluation
 
@@ -154,6 +157,6 @@ print("WER: {:2f}".format(100 * chunked_wer(result["sentence"], result["pred_str
 
 ## Training
 
-Blocks of examples from the Common Voice training dataset
+Blocks of examples from the Common Voice training dataset were used for training, after filtering out utterances that had any `down_vote` or were longer than 9.5 seconds. The data used totals about 100k examples, 20% of the available data. Training proceeded for 30k global steps, on 1 V100 GPU provided by OVHcloud. For validation, 2048 examples of the validation dataset were used.
 
 The [script used for training](https://github.com/serapio/transformers/blob/feature/xlsr-finetune/examples/research_projects/wav2vec2/run_common_voice.py) is adapted from the [example script provided in the transformers repo](https://github.com/huggingface/transformers/blob/master/examples/research_projects/wav2vec2/run_common_voice.py).
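The filtering rule in the updated Training paragraph (drop utterances that have any down vote or that run longer than 9.5 seconds) could be sketched roughly as below. This is an illustration, not the code from the linked training script. The field names `speech`, `sampling_rate`, and `down_votes` and the helper names are assumptions made for the example. Only the 9.5 second threshold comes from the README text.

```python
# Minimal sketch of the filtering rule described in the Training section.
# Assumed (not confirmed by the source): each example is a dict with a
# "speech" sample sequence, a "sampling_rate" in Hz, and a "down_votes" count.

MAX_DURATION_SECONDS = 9.5  # threshold stated in the README


def duration_seconds(example):
    """Length of the utterance in seconds, from sample count and rate."""
    return len(example["speech"]) / example["sampling_rate"]


def keep_example(example, max_duration=MAX_DURATION_SECONDS):
    """True if the utterance has no down votes and is not too long."""
    return (
        example["down_votes"] == 0
        and duration_seconds(example) <= max_duration
    )


def filter_examples(examples, max_duration=MAX_DURATION_SECONDS):
    """Return only the examples that pass keep_example."""
    return [ex for ex in examples if keep_example(ex, max_duration)]
```

With the Hugging Face `datasets` library, a predicate like `keep_example` could be passed to `Dataset.filter`, provided the dataset exposes columns with these (assumed) names.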