Neura Speech Nemo
Model Description
- Developed by: Neura company
- Funded by: Neura
- Model type: fa_FastConformers_Transducer
- Language(s) (NLP): Persian
Model Architecture
This model uses a FastConformer-TDT architecture. FastConformer, introduced in "Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition" [1], is an optimized version of the Conformer model with 8x depthwise-separable convolutional downsampling. More details are available in the Fast-Conformer Model documentation.
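To make the 8x downsampling concrete, the sketch below estimates how many encoder frames a clip produces. It assumes a 10 ms feature hop, which is a common default for speech front ends but is not stated in this card. The function name and its parameters are illustrative, and real implementations may differ by a frame or two because of padding.

```python
def encoder_frames(duration_ms: int, hop_ms: int = 10, downsampling: int = 8) -> int:
    """Approximate encoder sequence length for a clip of the given duration.

    hop_ms=10 is an assumption (not taken from the model card);
    downsampling=8 is the factor named in the architecture description.
    """
    # Ceiling division with integers avoids floating-point rounding surprises.
    feature_frames = -(-duration_ms // hop_ms)
    return -(-feature_frames // downsampling)


# A 10-second clip: 1000 feature frames, reduced 8x by the encoder front end.
print(encoder_frames(10_000))  # 125
```

Under these assumptions each encoder frame covers roughly 80 ms of audio, which is why the attention and decoding stages see a much shorter sequence than the raw feature stream.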
Uses
Check out the Google Colab demo to run NeuraSpeech ASR on a free-tier Google Colab instance.
Make sure these packages are installed:
!pip install "nemo_toolkit[all]"
# Play the sample audio in the notebook.
from IPython.display import Audio, display
display(Audio('persian_audio.mp3', rate=32_000, autoplay=True))
# Confirm which NeMo version is installed.
import nemo
print('nemo', nemo.__version__)
import nemo.collections.asr as nemo_asr
# Load the pretrained checkpoint.
asr_model = nemo_asr.models.EncDecRNNTBPEModel.from_pretrained(model_name="Neurai/NeuraSpeech_900h")
# The file list is passed positionally because the keyword name differs between NeMo releases.
asr_model.transcribe(['persian_audio.mp3'], batch_size=1)[0]
Transcribed text:
او خواهان آزاد کردن بردگان بود
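For more than one file, a small helper can collect audio paths and feed them to the model in batches. The following is a minimal sketch, not part of the original demo: `find_audio_files`, `transcribe_in_batches`, the extension list, and the `transcribe_fn` callable are all hypothetical names introduced here. `transcribe_fn` is assumed to take a list of paths and return one string per path.

```python
from pathlib import Path
from typing import Callable, Dict, List, Sequence

# Assumption: these are the audio formats of interest; adjust as needed.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def find_audio_files(folder: str) -> List[str]:
    """Return the audio files directly inside `folder`, sorted by name."""
    return sorted(
        str(p)
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def transcribe_in_batches(
    paths: Sequence[str],
    transcribe_fn: Callable[[List[str]], List[str]],
    batch_size: int = 4,
) -> Dict[str, str]:
    """Map each path to its transcript, calling `transcribe_fn` once per batch."""
    results: Dict[str, str] = {}
    for start in range(0, len(paths), batch_size):
        batch = list(paths[start:start + batch_size])
        texts = transcribe_fn(batch)  # hypothetical wrapper around the ASR model
        for path, text in zip(batch, texts):
            results[path] = text
    return results
```

In practice `transcribe_fn` would wrap `asr_model.transcribe`. The shape of that method's return value varies across NeMo versions, so the wrapper should normalize it to a plain list of strings before returning.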
More Information
Model Card Authors
Esmaeil Zahedi, Mohsen Yazdinejad
Model Card Contact