ukrainian-tts / training /STEPS.md

Yurii Paniv

Add joint tacotron2_hifigan config

c527edf over 1 year ago

	# Setup environment
	Follow the official ESPnet installation guide: https://espnet.github.io/espnet/installation.html

	```sh
	sudo apt-get install cmake sox libsndfile1-dev ffmpeg
	git clone --branch v.202301 https://github.com/espnet/espnet
	cd ./espnet/tools
	./setup_anaconda.sh anaconda espnet 3.10
	. ./activate_python.sh
	make
	pip install --upgrade torch torchaudio # or set up the same versions
	make
	. ./activate_python.sh; python3 check_install.py
	```

	# Run training

	ESPnet changes quickly. For the latest guide, refer to https://github.com/espnet/espnet/tree/master/egs2/TEMPLATE/tts1

	This page records the general launch steps that were used for training, for reference only. It does not cover data preparation.

	NOTE: Before running the commands below, copy either [./train_vits.yaml](./train_vits.yaml) or [./finetune_joint_tacotron2_hifigan.yaml](./finetune_joint_tacotron2_hifigan.yaml) to your `<espnet_root>/egs2/ljspeech/tts1/conf/tuning/` folder.
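
The copy step in the note could be scripted roughly as follows. This is only a sketch: the function name `copy_tuning_configs` and its two arguments are hypothetical and not part of ESPnet or this repository. It copies whichever of the two config files exists in the source folder.

```sh
# Hypothetical helper (not part of ESPnet): copy the training configs
# into the ljspeech recipe's tuning folder.
#   $1 = folder holding the YAML files (e.g. this repo's training/ folder)
#   $2 = ESPnet checkout root (the <espnet_root> placeholder)
copy_tuning_configs() {
    src_dir="$1"
    dest_dir="$2/egs2/ljspeech/tts1/conf/tuning"
    mkdir -p "$dest_dir"
    for cfg in train_vits.yaml finetune_joint_tacotron2_hifigan.yaml; do
        if [ -f "$src_dir/$cfg" ]; then
            cp "$src_dir/$cfg" "$dest_dir/"
        else
            # Only one of the two configs is needed, so a missing file is not fatal.
            echo "warning: $src_dir/$cfg not found, skipping" >&2
        fi
    done
}
```

For example, `copy_tuning_configs . /path/to/espnet` run from the folder containing the YAML files, where `/path/to/espnet` stands in for your own checkout location.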


	```sh
	cd ../egs2/ljspeech/tts1
	pip install torchvision # to save figures
	pip install speechbrain # for x-vectors
	# option 1: train VITS
	./run.sh \
	--stage 6 \
	--min_wav_duration 0.38 \
	--use_xvector true \
	--xvector_tool speechbrain \
	--fs 22050 \
	--n_fft 1024 \
	--n_shift 256 \
	--win_length null \
	--dumpdir dump/22k \
	--expdir exp/22k \
	--tts_task gan_tts \
	--feats_extract linear_spectrogram \
	--feats_normalize none \
	--train_config ./conf/tuning/train_vits.yaml \
	--inference_config ./conf/tuning/decode_vits.yaml
	# option 2: train tacotron2 and hifigan jointly
	./run.sh \
	--stage 6 \
	--min_wav_duration 0.38 \
	--use_xvector true \
	--xvector_tool speechbrain \
	--fs 22050 \
	--n_fft 1024 \
	--n_shift 256 \
	--win_length null \
	--dumpdir dump/22k \
	--expdir exp/22k \
	--train_config ./conf/tuning/finetune_joint_tacotron2_hifigan.yaml \
	--tts_task gan_tts

	```
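
The two commands above share most of their flags. One possible way to keep them in sync is a small wrapper that prints the full command for the chosen model. This is a sketch, not part of the recipe: `common_opts` and `train_cmd` are hypothetical names, and every flag value is copied from the commands above.

```sh
# Flags shared by both training options (copied from the commands above).
common_opts="--stage 6 --min_wav_duration 0.38 --use_xvector true \
--xvector_tool speechbrain --fs 22050 --n_fft 1024 --n_shift 256 \
--win_length null --dumpdir dump/22k --expdir exp/22k --tts_task gan_tts"

# Hypothetical helper: print the run.sh command for "vits" or "joint".
train_cmd() {
    case "$1" in
        vits)
            echo "./run.sh $common_opts" \
                "--feats_extract linear_spectrogram --feats_normalize none" \
                "--train_config ./conf/tuning/train_vits.yaml" \
                "--inference_config ./conf/tuning/decode_vits.yaml"
            ;;
        joint)
            echo "./run.sh $common_opts" \
                "--train_config ./conf/tuning/finetune_joint_tacotron2_hifigan.yaml"
            ;;
        *)
            echo "usage: train_cmd vits|joint" >&2
            return 1
            ;;
    esac
}
```

Running `train_cmd vits` prints the command so it can be inspected first. From inside `egs2/ljspeech/tts1`, `eval "$(train_cmd vits)"` would then launch it.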