metadata
license: mit
title: KKR2
sdk: gradio
colorFrom: blue
colorTo: green

Text-to-Speech App with Kokoro-82M-ONNX

This Gradio app converts text to speech (TTS) with the Kokoro-82M-ONNX model hosted on Hugging Face. You can choose from several speaker voices and download the generated audio as a .wav file.

Features

  • Text-to-Speech Conversion: Convert any input text into speech.
  • Multiple Speakers: Choose from different speaker voices.
  • Download Audio: Download the generated speech as a .wav file.

How to Use

  1. Enter Text: Type or paste your text into the input box.
  2. Select Speaker: Choose a speaker from the dropdown menu.
  3. Generate Speech: Click the "Submit" button to generate the speech.
  4. Download Audio: Once the speech is generated, play it in the browser or download it as a .wav file.
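
The steps above map onto a small Gradio interface: a text box, a speaker dropdown, and an audio output. The following is a minimal sketch of that wiring, not the app's actual code. `synthesize` is a hypothetical placeholder that writes a short tone instead of running the model, the "Speaker 2" label is invented for illustration, and the 24 kHz sample rate is an assumption to be checked against the model's real output rate.

```python
import math
import struct
import tempfile
import wave

SAMPLE_RATE = 24_000  # assumed output rate; verify against the model
SPEAKERS = ["Speaker 1", "Speaker 2"]  # hypothetical dropdown labels


def synthesize(text: str, speaker: str) -> str:
    """Placeholder for model inference: writes a short tone to a .wav file.

    Replace the tone generation with the real Kokoro-82M-ONNX call.
    Returns the path of the written file.
    """
    if not text.strip():
        raise ValueError("Please enter some text.")
    # Tone length grows with the text, capped at 3 seconds.
    duration = min(3.0, 0.05 * len(text))
    # Each speaker label gets its own pitch so the choice is audible.
    freq = 220.0 * (SPEAKERS.index(speaker) + 1)
    n_samples = int(SAMPLE_RATE * duration)
    samples = [
        int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE))
        for i in range(n_samples)
    ]
    tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
    tmp.close()
    with wave.open(tmp.name, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{n_samples}h", *samples))
    return tmp.name


def build_demo():
    """Build the interface: text box and speaker dropdown in, audio out."""
    import gradio as gr  # imported here so synthesize() works without Gradio

    return gr.Interface(
        fn=synthesize,
        inputs=[
            gr.Textbox(label="Text", lines=4),
            gr.Dropdown(choices=SPEAKERS, value=SPEAKERS[0], label="Speaker"),
        ],
        outputs=gr.Audio(label="Generated speech"),
        title="KKR2",
    )
```

Calling `build_demo().launch()` starts the app locally. Returning a file path from the function lets the audio component offer both playback and download.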

Example Inputs

  • Text: "Hello, welcome to the text-to-speech app!"
  • Speaker: "Speaker 1"

Requirements

The app requires the following Python packages:

  • onnxruntime
  • torch
  • gradio
  • scipy
  • numpy
  • huggingface_hub

These dependencies are automatically installed when the Space is built.
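
When running the app outside the Space, it can help to confirm that the packages listed above are importable before starting it. This is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the check says nothing about package versions.

```python
from importlib.util import find_spec

# Package names taken from the requirements list above.
REQUIRED = ["onnxruntime", "torch", "gradio", "scipy", "numpy", "huggingface_hub"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


print("Missing packages:", missing_packages() or "none")
```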

Model Details

The app uses Kokoro-82M-ONNX, the Kokoro text-to-speech model in ONNX format. With roughly 82 million parameters it is lightweight for a TTS model, and it supports multiple speaker voices.

Limitations

  • The model may not handle very long texts efficiently.
  • Speaker options are limited to the embeddings supported by the model.
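
One common workaround for the long-text limitation is to split the input at sentence boundaries and synthesize each piece separately. The sketch below shows only the splitting step. The function name `split_text` and the 300-character default are assumptions for illustration, not values taken from the app or the model.

```python
import re


def split_text(text: str, max_chars: int = 300) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Breaks at sentence ends where possible; a single sentence longer
    than the limit is cut at the limit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-split a sentence that is longer than the limit on its own.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Append the sentence to the current chunk if it still fits.
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed to the model in turn and the resulting audio segments concatenated in order.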

Feedback and Contributions

If you encounter any issues or have suggestions for improvement, please open an issue on the GitHub repository or contact me directly.


Enjoy using the app! 🎉