speechbrain
English
Vocoder
HiFIGAN
speech-synthesis
chaanks committed on
Commit 05f1ea8 · verified · 1 Parent(s): 9afea3c

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -20,7 +20,7 @@ datasets:
 This repository provides all the necessary tools for using a [scalable HiFiGAN Unit](https://arxiv.org/abs/2406.10735) vocoder trained with [LibriTTS](https://www.openslr.org/141/).
 
 The pre-trained model take as input discrete self-supervised representations and produces a waveform as output. This is suitable for a wide range of generative tasks such as speech enhancement, separation, text-to-speech, voice cloning, etc. Please read [DASB - Discrete Audio and Speech Benchmark](https://arxiv.org/abs/2406.14294) for more information.
 
-To generate the discrete self-supervised representations, we employ a K-means clustering model trained using `microsoft/wavlm-large` hidden layers, with k=1000.
+To generate the discrete self-supervised representations, we employ a K-means clustering model trained using `microsoft/wavlm-large` hidden layers ([1, 3, 7, 12, 18, 23]), with k=1000.
 
 ## Install SpeechBrain
 
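For context on the change above, here is a minimal sketch of how a unit vocoder like this is typically driven from SpeechBrain. The `UnitHIFIGAN` class, `from_hparams`, and `decode_unit` come from `speechbrain.inference.vocoders`; the repository ID, the unit-tensor shape, and the mapping of the six WavLM layers to code columns are assumptions for illustration, not details taken from this commit.

```python
import torch
from speechbrain.inference.vocoders import UnitHIFIGAN

# Load the pre-trained unit vocoder.
# NOTE: the source repo ID below is a placeholder; substitute the actual
# Hugging Face repository this README belongs to.
hifi_gan_unit = UnitHIFIGAN.from_hparams(
    source="speechbrain/<this-vocoder-repo>",
    savedir="pretrained_models/unit-vocoder",
)

# Discrete self-supervised units, i.e. K-means (k=1000) cluster indices
# computed from the WavLM-large hidden layers [1, 3, 7, 12, 18, 23].
# The [time, num_layers] shape is assumed for illustration; random codes
# stand in for real ones here.
codes = torch.randint(0, 1000, (100, 6))

# Decode the discrete units back into a waveform.
waveform = hifi_gan_unit.decode_unit(codes)
print(waveform.shape)
```

For padded batches of unit sequences, `decode_batch` would be the analogous call.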