Tags: Text-to-Speech, PyTorch, ONNX, Catalan, matcha-tts, acoustic modelling, speech, multispeaker
AlexK-PL committed on
Commit 6a5a7ed
1 Parent(s): f08fdc4

Update README.md

Files changed (1)
  1. README.md +18 -3
README.md CHANGED
@@ -33,9 +33,10 @@ datasets:
33
 
34
  ## Model description
35
 
36
- Matcha-TTS is an encoder-decoder architecture designed for fast acoustic modelling in TTS. The encoder predicts phoneme durations and its mean feature vectors
37
- modelling alignment with Monotonic Alignment Search (MAS). And the decoder is essentially a U-Net inspired by Grad-TTS, that is based on Transformers architecture combined
38
- with 1D instead of 2D CNNs, making a high reduction on memory consumption and speedy synthesis.
 
39
  Matcha-TTS is non-autoregressive and is trained using optimal-transport conditional flow matching (OT-CFM).
40
  This yields an ODE-based decoder capable of high output quality in fewer synthesis steps than models trained using score matching.
41
 
@@ -99,6 +100,20 @@ Data comes from two different datasets: festcat and openslr69
99
 
100
  ### Results
101
 
102
 
103
  ## Additional information
104
 
 
33
 
34
  ## Model description
35
 
36
+ Matcha-TTS is an encoder-decoder architecture designed for fast acoustic modelling in TTS. The encoder predicts phoneme durations and their mean feature vectors.
37
+ The decoder is essentially a U-Net inspired by Grad-TTS, built on a Transformer architecture combined
38
+ with 1D instead of 2D CNNs, which substantially reduces memory consumption and speeds up synthesis.
39
+
40
  Matcha-TTS is non-autoregressive and is trained using optimal-transport conditional flow matching (OT-CFM).
41
  This yields an ODE-based decoder capable of high output quality in fewer synthesis steps than models trained using score matching.
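
The following is a minimal sketch of the idea behind OT-CFM sampling, not the model's actual code. It assumes the standard optimal-transport conditional path x_t = (1 - (1 - sigma_min) t) x_0 + t x_1 with target velocity u_t = x_1 - (1 - sigma_min) x_0, and it replaces the trained decoder with a closed-form "oracle" vector field for a single known target. The names `ot_cfm_pair`, `euler_sample`, `make_oracle_field`, the value of `SIGMA_MIN`, and the `(4, 80)` array shape are all hypothetical choices made for illustration.

```python
import numpy as np

SIGMA_MIN = 1e-4  # assumed value, chosen only for this illustration


def ot_cfm_pair(x0, x1, t, sigma_min=SIGMA_MIN):
    """Return the point x_t on the OT conditional path and its target velocity u_t."""
    x_t = (1.0 - (1.0 - sigma_min) * t) * x0 + t * x1
    u_t = x1 - (1.0 - sigma_min) * x0
    return x_t, u_t


def euler_sample(vector_field, x0, n_steps):
    """Integrate dx/dt = vector_field(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = np.array(x0, dtype=np.float64)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        x = x + dt * vector_field(x, t)
    return x


def make_oracle_field(x1, sigma_min=SIGMA_MIN):
    """Closed-form conditional field for one known target x1.

    This stands in for the trained decoder, which would instead predict
    the velocity from (x, t) and the encoder output.
    """
    def field(x, t):
        return (x1 - (1.0 - sigma_min) * x) / (1.0 - (1.0 - sigma_min) * t)
    return field


rng = np.random.default_rng(0)
x1 = rng.normal(size=(4, 80))  # hypothetical target features
x0 = rng.normal(size=(4, 80))  # Gaussian noise at t=0

# Sample with different numbers of Euler steps.
samples = {n: euler_sample(make_oracle_field(x1), x0, n) for n in (1, 2, 10)}
```

Because the conditional path is a straight line, the velocity is constant along each trajectory, so in this toy setting Euler integration reaches the same endpoint regardless of the step count. A trained decoder approximates the marginal field, which is not exactly straight, so in practice a handful of steps is still needed.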
42
 
 
100
 
101
  ### Results
102
 
103
+ ## Citation
104
+
105
+ If this code contributes to your research, please cite the work:
106
+
107
+ ```
108
+ @misc{mehta2024matchatts,
109
+ title={Matcha-TTS: A fast TTS architecture with conditional flow matching},
110
+ author={Shivam Mehta and Ruibo Tu and Jonas Beskow and Éva Székely and Gustav Eje Henter},
111
+ year={2024},
112
+ eprint={2309.03199},
113
+ archivePrefix={arXiv},
114
+ primaryClass={eess.AS}
115
+ }
116
+ ```
117
 
118
  ## Additional information
119