Publications
Here, we list a collection of research articles that utilize the NeMo Toolkit. If you would like to include your paper in this collection, please submit a PR updating this document.
Automatic Speech Recognition (ASR)
2023
2021
- Citrinet: Closing the Gap between Non-Autoregressive and Autoregressive End-to-End Models for Automatic Speech Recognition
- SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition
- CarneliNet: Neural Mixture Model for Automatic Speech Recognition
- CTC Variations Through New WFST Topologies
- A Toolbox for Construction and Analysis of Speech Datasets
2020
2019
Speaker Recognition (SpkR)
Speech Classification
2022
2021
Speech Translation
Natural Language Processing (NLP)
Language Modeling
2022
Neural Machine Translation
Dialogue State Tracking
Text To Speech (TTS)
2021
- TalkNet: Fully-Convolutional Non-Autoregressive Speech Synthesis Model
- TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction
- Hi-Fi Multi-Speaker English TTS Dataset
- Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings