ai4bharat/indic-parler-tts-pretrained

@@ -39,7 +39,7 @@ datasets:
 **Indic Parler-TTS** is a multilingual Indic extension of [Parler-TTS Mini](https://huggingface.co/parler-tts/parler-tts-mini-v1.1).
-It is a fine-tuned version, trained on a **8,385 hours** multilingual Indic and English dataset.
+It is a fine-tuned version of [Parler-TTS Mini v1.1](https://huggingface.co/parler-tts/parler-tts-mini-v1.1), trained on a **8,385 hours** multilingual Indic and English dataset.
 **Indic Parler-TTS Mini** can officially speak in 20 Indic languages, making it comprehensive for regional language technologies, and in English. The **21 languages** supported are: Assamese, Bengali, Bodo, Dogri, English, Gujarati, Hindi, Kannada, Konkani, Maithili, Malayalam, Manipuri, Marathi, Nepali, Odia, Sanskrit, Santali, Sindhi, Tamil, Telugu, and Urdu.
@@ -93,7 +93,7 @@ The model accepts two primary inputs:
    - For other accents, the model allows customization by specifying accent details, such as "A male British speaker" or "A female American speaker," using style transfer for more dynamic and personalized outputs.
 5. **Customizable Output**
-   IndicParlerTTS offers precise control over various speech characteristics using the **caption** input:
+   Indic Parler-TTS offers precise control over various speech characteristics using the **caption** input:
    - **Background Noise**: Adjust the noise level in the audio, from clear to slightly noisy environments.
    - **Reverberation**: Control the perceived distance of the voice, from close-sounding to distant-sounding speech.
@@ -107,7 +107,7 @@ The model accepts two primary inputs:
 🚨 Unlike previous versions of Parler-TTS, here we use two tokenizers - one for the prompt and one for the description. 🚨
-**Parler-TTS** has been trained to generate speech with features that can be controlled with a simple text prompt, for example:
+**Indic Parler-TTS** has been trained to generate speech with features that can be controlled with a simple text prompt, for example:
 ```py
 import torch
@@ -132,7 +132,7 @@ audio_arr = generation.cpu().numpy().squeeze()
 sf.write("indic_tts_out.wav", audio_arr, model.config.sampling_rate)
 ```
-IndicParlerTTS provides highly effective control over key aspects of speech synthesis using descriptive captions. Below is a summary of what each control parameter can achieve:
+Indic Parler-TTS provides highly effective control over key aspects of speech synthesis using descriptive captions. Below is a summary of what each control parameter can achieve:
 | **Control Type**        | **Capabilities**                                                                 |
 |--------------------------|----------------------------------------------------------------------------------|
@@ -146,7 +146,7 @@ IndicParlerTTS provides highly effective control over key aspects of speech synt
 ## 🌍 Switching languages
-The template automatically adapts to the language it detects in the prompt. You don't need to specify the language you want to use. For example, to switch to Hindi, simply use an Hindi prompt:
+The model automatically adapts to the language it detects in the prompt. You don't need to specify the language you want to use. For example, to switch to Hindi, simply use an Hindi prompt:
 ```py
 import torch
@@ -266,7 +266,7 @@ Here is the table based on the provided data:
 ## 📐 Evaluation
-IndicParlerTTS has been evaluated using a MOS-like framework by native and non-native speakers. The results highlight its exceptional performance in generating natural and intelligible speech, especially for native speakers of Indian languages.
+Indic Parler-TTS has been evaluated using a MOS-like framework by native and non-native speakers. The results highlight its exceptional performance in generating natural and intelligible speech, especially for native speakers of Indian languages.
 | **Language** | **Native Speaker Score (%)** | **Highlights**                                                                                     |
 |--------------|-------------------------------|--------------------------------------------------------------------------------------------------|