metadata

tags:
  - audio
  - text-to-speech
language: kbd
license: mit
datasets:
  - anzorq/kbd_speech
pipeline_tag: text-to-speech

KBD TTS Male Model

Install dependencies

pip install git+https://github.com/coqui-ai/TTS@dev#egg=TTS
pip install gradio

Usage

import os
from TTS.utils.download import download_url
from TTS.utils.synthesizer import Synthesizer
import tempfile

def download_model_and_config():
    dir_path = "kbd-vits-tts"
    # Create the target directory if it does not exist yet
    os.makedirs(dir_path, exist_ok=True)
    model_url = "https://huggingface.co/anzorq/kbd-vits-tts-male/resolve/main/checkpoint_56000.pth"
    config_url = "https://huggingface.co/anzorq/kbd-vits-tts-male/resolve/main/config_35000.json"
    download_url(model_url, dir_path, "model.pth")
    download_url(config_url, dir_path, "config.json")
    return dir_path

model_dir = download_model_and_config()

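# Load the model once so repeated calls do not reload the checkpoint
synthesizer = Synthesizer(f"{model_dir}/model.pth", f"{model_dir}/config.json")

def tts_male(text: str) -> str:
    # The model was trained with the lowercase palochka, so map Latin capital "I" to "ӏ"
    text = text.replace("I", "ӏ")
    wavs = synthesizer.tts(text)

    # Write the audio to a temporary .wav file and return its path
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as fp:
        synthesizer.save_wav(wavs, fp)
        return fp.name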

text = "Гупсыси псалъэ, зыплъыхьи тIыс"
output_path = tts_male(text)
print(f"Generated audio saved at: {output_path}")

This generates an audio file with the male model, saves it to a temporary file, and prints the path to that file.

Note

The model was trained on text that uses the lowercase palochka symbol "ӏ" (U+04CF).

Before synthesis, replace Latin capital "I" and similar look-alike symbols in the input text with "ӏ". The code above does this for Latin capital "I".
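
The replacement can cover more than one look-alike in a single pass. The following is a minimal sketch, not part of the model's own code: the helper name `normalize_palochka` is hypothetical, and the set of look-alike characters (Latin capital I, Cyrillic capital palochka, Cyrillic Byelorussian-Ukrainian I) is an assumption about what may appear in Kabardian text. Adjust it to your data, and note that it would also rewrite a Latin "I" inside any non-Kabardian words.

```python
# Sketch: map common palochka look-alikes to the lowercase palochka.
# The look-alike set below is an assumption, not taken from the model card.
PALOCHKA = "\u04cf"  # Cyrillic small letter palochka

LOOKALIKES = {
    "\u0049": PALOCHKA,  # Latin capital letter I
    "\u04c0": PALOCHKA,  # Cyrillic capital letter palochka
    "\u0406": PALOCHKA,  # Cyrillic capital letter Byelorussian-Ukrainian I
}

# Build the translation table once; str.translate applies it in one pass.
_TABLE = str.maketrans(LOOKALIKES)


def normalize_palochka(text: str) -> str:
    """Return text with every look-alike replaced by the lowercase palochka."""
    return text.translate(_TABLE)
```

The normalized string can then be passed to `tts_male`, for example `tts_male(normalize_palochka(text))`.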