Upload app.py
app.py
CHANGED
@@ -467,14 +467,14 @@ with gr.Blocks() as lf_tts:
|
|
467 |
|
468 |
with gr.Blocks() as about:
|
469 |
gr.Markdown("""
|
470 |
-
Kokoro is a frontier TTS model for its size. It has 80 million parameters
|
471 |
|
472 |
### FAQ
|
473 |
#### Will this be open sourced?
|
474 |
-
There currently isn't a release date scheduled for the weights. The inference code in this space is MIT licensed. The architecture was already published by Li et al, with MIT licensed code and pretrained weights
|
475 |
|
476 |
#### What is the difference between stable and unstable voices?
|
477 |
-
|
478 |
|
479 |
#### How can CPU be faster than ZeroGPU?
|
480 |
The CPU is a dedicated resource for this Space, while the ZeroGPU pool is shared and dynamically allocated across all of HF. The ZeroGPU queue/allocator system inevitably adds latency to each request.<br/>
|
@@ -507,26 +507,31 @@ print(out_ps)
|
|
507 |
```
|
508 |
This Space and the underlying Kokoro model are both under development and subject to change. Reliability is not guaranteed. Hugging Face and Gradio might enforce their own rate limits.
|
509 |
|
510 |
-
### Model Version History
|
511 |
-
| Version | Date | Val mel / dur / f0 Losses |
|
512 |
-
| ------- | ---- | ------------------------- |
|
513 |
-
| v0.19 | 2024 Nov 22 | 0.261 / 0.627 / 1.897 |
|
514 |
-
| v0.16 | 2024 Nov 15 | 0.263 / 0.646 / 1.934 |
|
515 |
-
| v0.14 | 2024 Nov 12 | 0.262 / 0.642 / 1.889 |
|
516 |
-
|
517 |
### Licenses
|
518 |
Inference code: MIT<br/>
|
519 |
-
espeak-ng
|
520 |
-
Random English texts: Unknown
|
521 |
-
Random Japanese texts: CC0 public domain
|
522 |
-
|
523 |
-
|
524 |
-
|
525 |
-
|
526 |
-
|
527 |
-
|
528 |
-
|
529 |
-
|
530 |
""")
|
531 |
|
532 |
with gr.Blocks() as app:
|
|
|
467 |
|
468 |
with gr.Blocks() as about:
|
469 |
gr.Markdown("""
|
470 |
+
Kokoro is a frontier TTS model for its size. It has [80 million](https://hf.co/spaces/hexgrad/Kokoro-TTS/blob/main/app.py#L31) parameters, uses a lean [StyleTTS 2](https://github.com/yl4579/StyleTTS2) architecture, and was trained on high-quality data. The weights are currently private, but a free public demo is hosted here, at `https://hf.co/spaces/hexgrad/Kokoro-TTS`. The Community tab is open for feature requests, bug reports, etc. For other inquiries, contact `@rzvzn` on Discord.
|
471 |
|
472 |
### FAQ
|
473 |
#### Will this be open sourced?
|
474 |
+
There currently isn't a release date scheduled for the weights. The inference code in this space is MIT licensed. The architecture was already published by Li et al, with MIT licensed code and pretrained weights.
|
475 |
|
476 |
#### What is the difference between stable and unstable voices?
|
477 |
+
Unstable voices are more likely to stumble or produce unnatural artifacts, especially on short or strange texts. Stable voices are more likely to deliver natural speech on a wider range of inputs. The first two audio clips in this [blog post](https://hf.co/blog/hexgrad/kokoro-short-burst-upgrade) are examples of unstable and stable speech. Note that even unstable voices can sound fine on medium to long texts.
|
478 |
|
479 |
#### How can CPU be faster than ZeroGPU?
|
480 |
The CPU is a dedicated resource for this Space, while the ZeroGPU pool is shared and dynamically allocated across all of HF. The ZeroGPU queue/allocator system inevitably adds latency to each request.<br/>
|
|
|
507 |
```
|
508 |
This Space and the underlying Kokoro model are both under development and subject to change. Reliability is not guaranteed. Hugging Face and Gradio might enforce their own rate limits.
|
509 |
|
510 |
### Licenses
|
511 |
Inference code: MIT<br/>
|
512 |
+
[eSpeak NG](https://github.com/espeak-ng/espeak-ng): GPL-3.0<br/>
|
513 |
+
Random English texts: Unknown from [Quotable Data](https://github.com/quotable-io/data/blob/master/data/quotes.json)<br/>
|
514 |
+
Random Japanese texts: CC0 public domain from [Common Voice](https://github.com/common-voice/common-voice/tree/main/server/data/ja)
|
515 |
+
""")
|
516 |
+
|
517 |
+
with gr.Blocks() as changelog:
|
518 |
+
gr.Markdown("""
|
519 |
+
### 23 Nov 2024
|
520 |
+
🔀 Hardware switching between CPU and GPU
|
521 |
+
🗣️ Restored old voices, back up to 32 total
|
522 |
+
|
523 |
+
### 22 Nov 2024
|
524 |
+
🚀 Model v0.19
|
525 |
+
🧪 Validation losses: 0.261 mel / 0.627 dur / 1.897 f0
|
526 |
+
📝 https://hf.co/blog/hexgrad/kokoro-short-burst-upgrade
|
527 |
+
|
528 |
+
### 15 Nov 2024
|
529 |
+
🚀 Model v0.16
|
530 |
+
🧪 Validation losses: 0.263 mel / 0.646 dur / 1.934 f0
|
531 |
+
|
532 |
+
### 12 Nov 2024
|
533 |
+
🚀 Model v0.14
|
534 |
+
🧪 Validation losses: 0.262 mel / 0.642 dur / 1.889 f0
|
535 |
""")
|
536 |
|
537 |
with gr.Blocks() as app:
|
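This commit replaces the "Model Version History" table with per-date changelog entries that carry the same version numbers and validation losses. As a minimal sketch of that mapping, the snippet below renders the three model releases from structured data into the changelog's Markdown shape. The `Release` dataclass and `render_changelog` helper are hypothetical names introduced for illustration and are not part of `app.py`; the sketch covers only the model-release entries, not the 23 Nov 2024 feature notes or the blog link.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """One model release row (hypothetical structure, not from app.py)."""
    version: str
    date: str
    mel: float
    dur: float
    f0: float


# Values copied from the version history table removed in this diff.
RELEASES = [
    Release("v0.19", "22 Nov 2024", 0.261, 0.627, 1.897),
    Release("v0.16", "15 Nov 2024", 0.263, 0.646, 1.934),
    Release("v0.14", "12 Nov 2024", 0.262, 0.642, 1.889),
]


def render_changelog(releases):
    """Render releases as Markdown sections in the changelog's format."""
    sections = []
    for r in releases:
        sections.append(
            f"### {r.date}\n"
            f"🚀 Model {r.version}\n"
            f"🧪 Validation losses: {r.mel:.3f} mel / {r.dur:.3f} dur / {r.f0:.3f} f0"
        )
    # A blank line separates sections, as in the changelog block above.
    return "\n\n".join(sections)


CHANGELOG_MD = render_changelog(RELEASES)
print(CHANGELOG_MD)
```

A string built this way could be passed to `gr.Markdown` in place of the hand-written literal, which would keep the losses in one place if later releases are added. Whether that is worth the indirection for three entries is a judgment call.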