README.md · Pendrokar/xvapitch at 20c3e6b97a248fb9eb0285b858a16f7c053ef390

metadata

language:
  - en
  - de
  - es
  - it
  - nl
  - pt
  - pl
  - ro
  - sv
  - da
  - fi
  - hu
  - el
  - fr
  - ru
  - uk
  - tr
  - ar
  - hi
  - ja
  - ko
  - zh
  - vi
  - la
  - ha
  - sw
  - yo
  - wo
library: xvasynth
tags:
  - emotion
  - audio
  - text-to-speech
  - tts
pipeline_tag: text-to-speech

GitHub project: https://github.com/DanRuta/xVA-Synth

This is the base model for training other xVASynth "xVAPitch" type (v3) models. The model itself is used by the xVATrainer TTS model training app. All were created by Dan Ruta.

When used in the xVASynth editor, it is an American adult male voice. The default pacing is too fast and has to be adjusted.
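As a rough illustration of what a pacing adjustment does, the sketch below scales predicted per-symbol durations by a factor and rounds up, in the style of the length scaling used at inference time in VITS-like models (see the SDP paper referenced in this README). The function name `apply_pacing`, the parameter `length_scale`, and the sample durations are hypothetical. This is not the xVASynth API, and how the editor's pacing slider maps onto such a factor is an assumption.

```python
import math
from typing import List


def apply_pacing(durations: List[float], length_scale: float) -> List[int]:
    """Scale per-symbol durations (in frames) by a hypothetical length_scale.

    length_scale > 1.0 lengthens every symbol (slower speech);
    length_scale < 1.0 shortens it (faster speech).
    """
    if length_scale <= 0:
        raise ValueError("length_scale must be positive")
    # Round up and keep at least one frame so no symbol disappears.
    return [max(1, math.ceil(d * length_scale)) for d in durations]


# Made-up durations for four symbols, slowed down by 25%.
predicted = [3.0, 5.0, 2.0, 4.0]
slowed = apply_pacing(predicted, 1.25)
print(slowed)  # [4, 7, 3, 5]
```

A factor above 1.0 is the direction needed to compensate for output that is too fast under this convention; the actual value has to be chosen by ear.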

xVAPitch_5820651 model sample:

Papers referenced by the xVAPitch model:

- Multi-head attention with relative positional embedding: https://arxiv.org/pdf/1809.04281.pdf
- Transformer with relative positional encoding: https://arxiv.org/abs/1803.02155
- Stochastic Duration Predictor (SDP): https://arxiv.org/pdf/2106.06103.pdf
- Spline flow: https://arxiv.org/abs/1906.04032

Datasets used: unknown/non-permissible data