e-branchformer et
Estonian ASR model trained on the ERR2020 dataset with the ESPnet2 E-Branchformer recipe (https://github.com/espnet/espnet/tree/master/egs2/librispeech_100/asr1)
- WER on ERR2020: 9.9
- WER on mozilla commonvoice_11: 20.8
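For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate can be computed. It assumes the figures above are percentages of word-level edits (substitutions, insertions, deletions) relative to the reference length. The function name `word_error_rate` and the example sentences are illustrative only; this is not the scoring script used by the recipe.

```python
def word_error_rate(reference, hypothesis):
    """Return WER in percent: word-level edit distance / reference length * 100."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            cur.append(min(
                prev[j] + 1,                      # deletion
                cur[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = cur
    return 100.0 * prev[-1] / len(ref_words)


# One word dropped out of a four-word reference.
print(word_error_rate("tere tulemast eesti raadio", "tere tulemast raadio"))
```

Real evaluations usually normalise text (case, punctuation) before scoring, so numbers from this sketch may not match the reported ones exactly.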
For usage:
- clone this repo (git clone https://huggingface.co/rristo/espnet_ebranchformer_et)
- go to repo (cd espnet_ebranchformer_et)
- build docker image for needed libraries (build.sh or build.bat)
- run docker container (run.sh). This mounts the current directory
- run notebook example_usage.ipynb for example usage
- currently expects audio to be in .wav format
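Since the notebook currently expects .wav input, it can help to check a file before transcribing it. The following is a minimal sketch using Python's standard `wave` module. The function name `describe_wav` and the file name `sample.wav` are hypothetical, and the sample rate the model expects is not stated in this card, so the sketch only reports what it finds.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM .wav file.

    Raises wave.Error if the file is not a readable PCM WAV file.
    """
    with wave.open(path, "rb") as wav_file:
        frame_count = wav_file.getnframes()
        sample_rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate": sample_rate,
            "sample_width_bytes": wav_file.getsampwidth(),
            "duration_seconds": frame_count / sample_rate if sample_rate else 0.0,
        }


if __name__ == "__main__":
    try:
        # "sample.wav" is a placeholder; point this at your own recording.
        print(describe_wav("sample.wav"))
    except (FileNotFoundError, wave.Error) as error:
        print(f"Not a usable .wav file: {error}")
```

Files in other formats (for example .mp3 or .flac) would need to be converted to .wav first with a tool of your choice.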
Model description
ASR model for Estonian, trained on the Estonian Public Broadcasting ERR2020 dataset (around 340 hours of audio).
Intended uses & limitations
Pretty much a toy model, trained on a limited amount of data. Might not work well on out-of-domain data (especially spontaneous or noisy data).
Training and evaluation data
Trained on ERR2020 data; evaluated on ERR2020 and Mozilla Common Voice 11 test data.
Training procedure
Trained with the ESPnet E-Branchformer recipe (https://github.com/espnet/espnet/tree/master/egs2/librispeech_100/asr1).
Training results
See the folder exp/images.
Validation results are in exp/RESULTS.md
Framework versions
- espnet2