metadata
language:
- en
- de
- es
- it
- nl
- pt
- pl
- ro
- sv
- da
- fi
- hu
- el
- fr
- ru
- uk
- tr
- ar
- hi
- ja
- ko
- zh
- vi
- la
- ha
- sw
- yo
- wo
library: xvasynth
tags:
- emotion
- audio
- text-to-speech
- tts
pipeline_tag: text-to-speech
GitHub project: https://github.com/DanRuta/xVA-Synth
The base model for training other xVASynth "xVAPitch" type (v3) models. The model itself is used by xVATrainer, the TTS model training app. All were created by Dan Ruta.
When used in the xVASynth editor, it is an American adult male voice. The default pacing is too fast and has to be adjusted.
xVAPitch_5820651 model sample:
Papers referenced by the xVAPitch model:
- Multi-head attention with relative positional embedding - https://arxiv.org/pdf/1809.04281.pdf
- Transformer with relative positional encoding - https://arxiv.org/abs/1803.02155
- SDP (stochastic duration predictor) - https://arxiv.org/pdf/2106.06103.pdf
- Spline flow - https://arxiv.org/abs/1906.04032
Datasets used: unknown/non-permissible data