Text-to-Speech
English
hexgrad committed on
Commit 10319fe · 1 Parent(s): 30618d0

Upload README.md

Files changed (1)
  1. README.md +12 -6
README.md CHANGED
@@ -25,10 +25,16 @@ pipeline_tag: text-to-speech
 
  ### Releases
 
- | Model | Published | Training Data | Compute (A100 80GB) | Langs & Voices | SHA256 |
- | ----- | --------- | ------------- | ------------------- | -------------- | ------ |
- | **v1.0** | **2025 Jan 27** | **Few hundred hrs** | **$1000 for 1000 hrs** | [**8 & 54**](https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md) | `496dba11` |
- | [v0.19](https://huggingface.co/hexgrad/kLegacy/tree/main/v0.19) | 2024 Dec 25 | <100 hrs | $400 for 500 hrs | 1 & 10 | `3b0c392f` |
 
  ### Usage
 
@@ -105,8 +111,6 @@ Under the hood, `kokoro` uses [`misaki`](https://pypi.org/project/misaki/), a G2
 
  ### Training Details
 
- **Compute:** About $1000 for 1000 hours of A100 80GB vRAM
-
  **Data:** Kokoro was trained exclusively on **permissive/non-copyrighted audio data** and IPA phoneme labels. Examples of permissive/non-copyrighted audio include:
  - Public domain audio
  - Audio licensed under Apache, MIT, etc
@@ -116,6 +120,8 @@ Under the hood, `kokoro` uses [`misaki`](https://pypi.org/project/misaki/), a G2
 
  **Total Dataset Size:** A few hundred hours of audio
 
  ### Creative Commons Attribution
 
  The following CC BY audio was part of the dataset used to train Kokoro v1.0.
 
 
  ### Releases
 
+ | Model | Published | Training Data | Langs & Voices | SHA256 |
+ | ----- | --------- | ------------- | -------------- | ------ |
+ | [v0.19](https://huggingface.co/hexgrad/kLegacy/tree/main/v0.19) | 2024 Dec 25 | <100 hrs | 1 & 10 | `3b0c392f` |
+ | **v1.0** | **2025 Jan 27** | **Few hundred hrs** | [**8 & 54**](https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md) | `496dba11` |
+
+ | Training Costs | v0.19 | v1.0 | **Total** |
+ | -------------- | ----- | ---- | ----- |
+ | in A100 80GB GPU hours | 500 | 500 | **1000** |
+ | average hourly rate | $0.80/h | $1.20/h | **$1/h** |
+ | in USD | $400 | $600 | **$1000** |
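
The rows of the new Training Costs table can be cross-checked against each other. The following is a minimal sketch, not part of the commit: the variable names are illustrative, and the figures are copied from the table on the assumption that each hourly rate is USD cost divided by GPU hours.

```python
# Figures copied from the "Training Costs" table; names are illustrative only.
gpu_hours = {"v0.19": 500, "v1.0": 500}   # A100 80GB GPU hours per release
usd_cost = {"v0.19": 400, "v1.0": 600}    # USD spent per release

# Assumed relationship: average hourly rate = USD cost / GPU hours.
hourly_rate = {version: usd_cost[version] / gpu_hours[version] for version in gpu_hours}

total_hours = sum(gpu_hours.values())
total_usd = sum(usd_cost.values())
average_rate = total_usd / total_hours    # blended rate across both releases

print(hourly_rate)
print(total_hours, total_usd, average_rate)
```

Under that assumption, the per-release rates and the totals reproduce the table's $0.80/h, $1.20/h, 1000 hours, $1000, and $1/h entries.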
 
  ### Usage
 
 
 
  ### Training Details
 
  **Data:** Kokoro was trained exclusively on **permissive/non-copyrighted audio data** and IPA phoneme labels. Examples of permissive/non-copyrighted audio include:
  - Public domain audio
  - Audio licensed under Apache, MIT, etc
 
 
  **Total Dataset Size:** A few hundred hours of audio
 
+ **Total Training Cost:** About $1000 for 1000 hours of A100 80GB vRAM
+
  ### Creative Commons Attribution
 
  The following CC BY audio was part of the dataset used to train Kokoro v1.0.