hexgrad committed on
Commit 94519cf · 1 Parent(s): 9e1d2c9

Upload 2 files

Files changed (2)
  1. README.md +1 -1
  2. VOICES.md +15 -9
README.md CHANGED
@@ -24,7 +24,7 @@ pipeline_tag: text-to-speech

  | Model | Published | Training Data | Compute (A100 80GB) | Langs & Voices | SHA256 |
  | ----- | --------- | ------------- | ------------------- | -------------- | ------ |
- | **v1.0** | **2025 Jan 27** | **Few hundred hrs** | **$1000 for 1000 hrs** | [**6 & 46**](https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md) | `496dba11` |
+ | **v1.0** | **2025 Jan 27** | **Few hundred hrs** | **$1000 for 1000 hrs** | [**6 & 47**](https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md) | `496dba11` |
  | [v0.19](https://huggingface.co/hexgrad/kLegacy/tree/main/v0.19) | 2024 Dec 25 | <100 hrs | $400 for 500 hrs | 1 & 10 | `3b0c392f` |

  ### Usage
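The SHA256 column in the table above lists short hex values such as `496dba11`. A minimal sketch of how a downloaded file could be checked against such a value, assuming (this is not stated in the README) that each value is the first 8 hex characters of the file's SHA-256 digest. The helper names `sha256_prefix`, `file_sha256_prefix`, and `matches_listed_prefix` are hypothetical and use only the Python standard library.

```python
import hashlib


def sha256_prefix(data: bytes, length: int = 8) -> str:
    """Return the first `length` hex characters of the SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()[:length]


def file_sha256_prefix(path: str, length: int = 8, chunk_size: int = 1 << 20) -> str:
    """Same as sha256_prefix, but streams a file from disk in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large model files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()[:length]


def matches_listed_prefix(data: bytes, listed: str) -> bool:
    """Compare data against a listed prefix (assumed to be a digest prefix)."""
    return sha256_prefix(data, len(listed)) == listed.lower()
```

A prefix match is only a quick sanity check. An 8-character prefix is far weaker than a full digest, so it should not be treated as proof of integrity.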
VOICES.md CHANGED
@@ -10,14 +10,15 @@ Subjectively, voices will sound better or worse to different people.

  **Training Duration**
  - How much audio was seen during training? Smaller durations result in a lower overall grade.
- - 10 hours <= HH hours < 100 hours
+ - 10 hours <= **HH hours** < 100 hours
  - 1 hour <= H hours < 10 hours
  - 10 minutes <= MM minutes < 100 minutes
- - 1 minute <= M minutes < 10 minutes
+ - 1 minute <= _M minutes_ < 10 minutes 🤏

  ### American English 🇺🇸

- - [`misaki[en]`](https://github.com/hexgrad/misaki) `lang_code='a'` with `en-us` espeak-ng fallback
+ - `lang_code='a'` in [`misaki[en]`](https://github.com/hexgrad/misaki)
+ - espeak-ng `en-us` fallback

  | Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
  | ---- | ------ | -------------- | ----------------- | ------------- | ------ |
@@ -30,7 +31,7 @@ Subjectively, voices will sound better or worse to different people.
  | af_nova | 🚺 | B | MM minutes | C | `e0233676` |
  | af_river | 🚺 | C | MM minutes | D | `e149459b` |
  | af_sarah | 🚺 | B | H hours | C+ | `49bd364e` |
- | af_sky | 🚺 | B | M minutes | C- | `c799548a` |
+ | af_sky | 🚺🤏 | B | _M minutes_ | C- | `c799548a` |
  | am_adam | 🚹 | D | H hours | F+ | `ced7e284` |
  | am_echo | 🚹 | C | MM minutes | D | `8bcfdc85` |
  | am_eric | 🚹 | C | MM minutes | D | `ada66f0e` |
@@ -39,10 +40,12 @@ Subjectively, voices will sound better or worse to different people.
  | am_michael | 🚹 | B | H hours | C+ | `9a443b79` |
  | am_onyx | 🚹 | C | MM minutes | D | `e8452be1` |
  | am_puck | 🚹 | B | H hours | C+ | `dd1d8973` |
+ | am_santa | 🚹🤏 | C | _M minutes_ | D- | `7f2f7582` |

  ### British English 🇬🇧

- - [`misaki[en]`](https://github.com/hexgrad/misaki) `lang_code='b'` with `en-gb` espeak-ng fallback
+ - `lang_code='b'` in [`misaki[en]`](https://github.com/hexgrad/misaki)
+ - espeak-ng `en-gb` fallback

  | Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
  | ---- | ------ | -------------- | ----------------- | ------------- | ------ |
@@ -57,6 +60,7 @@ Subjectively, voices will sound better or worse to different people.

  ### French 🇫🇷

+ - `lang_code='f'` in [`misaki[en]`](https://github.com/hexgrad/misaki)
  - espeak-ng `fr-fr`
  - Total French training data: <11 hours

@@ -66,6 +70,7 @@ Subjectively, voices will sound better or worse to different people.

  ### Hindi 🇮🇳

+ - `lang_code='h'` in [`misaki[en]`](https://github.com/hexgrad/misaki)
  - espeak-ng `hi`
  - Total Hindi training data: H hours

@@ -78,6 +83,7 @@ Subjectively, voices will sound better or worse to different people.

  ### Italian 🇮🇳

+ - `lang_code='i'` in [`misaki[en]`](https://github.com/hexgrad/misaki)
  - espeak-ng `it`
  - Total Italian training data: H hours

@@ -88,20 +94,20 @@ Subjectively, voices will sound better or worse to different people.

  ### Japanese 🇯🇵

- - [`misaki[ja]`](https://github.com/hexgrad/misaki)
+ - `lang_code='j'` in [`misaki[ja]`](https://github.com/hexgrad/misaki)
  - Total Japanese training data: H hours

  | Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 | CC BY |
  | ---- | ------ | -------------- | ----------------- | ------------- | ------ | ----- |
  | jf_alpha | 🚺 | B | H hours | C+ | `1bf4c9dc` | |
  | jf_gongitsune | 🚺 | B | MM minutes | C | `1b171917` | [gongitsune](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__gongitsune.txt) |
- | jf_nezumi | 🚺 | B | M minutes | C- | `d83f007a` | [nezuminoyomeiri](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__nezuminoyomeiri.txt) |
+ | jf_nezumi | 🚺🤏 | B | _M minutes_ | C- | `d83f007a` | [nezuminoyomeiri](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__nezuminoyomeiri.txt) |
  | jf_tebukuro | 🚺 | B | MM minutes | C | `0d691790` | [tebukurowokaini](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__tebukurowokaini.txt) |
- | jm_kumo | 🚹 | B | M minutes | C- | `98340afd` | [kumonoito](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__kumonoito.txt) |
+ | jm_kumo | 🚹🤏 | B | _M minutes_ | C- | `98340afd` | [kumonoito](https://github.com/koniwa/koniwa/blob/master/source/tnc/tnc__kumonoito.txt) |

  ### Mandarin Chinese 🇨🇳

- - [`misaki[zh]`](https://github.com/hexgrad/misaki)
+ - `lang_code='z'` in [`misaki[zh]`](https://github.com/hexgrad/misaki)
  - Total Mandarin Chinese training data: H hours

  | Name | Traits | Target Quality | Training Duration | Overall Grade | SHA256 |
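
The Training Duration legend and the `lang_code` values in the VOICES.md changes can be sketched in code. This is an illustrative sketch, not part of the repository. The names `LANG_CODES`, `duration_bucket`, and `parse_voice_name` are hypothetical. Two assumptions are made: durations between 60 and 100 minutes, which satisfy both the `MM minutes` and `H hours` ranges as written, are treated as `H hours`; and voice names follow the pattern seen in the rows above, where the first letter matches the `lang_code` and the second letter is `f` or `m`.

```python
from typing import Tuple

# lang_code values as they appear in the VOICES.md changes.
LANG_CODES = {
    "a": "American English",
    "b": "British English",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "j": "Japanese",
    "z": "Mandarin Chinese",
}


def duration_bucket(minutes: float) -> str:
    """Map a training duration in minutes to the legend's bucket label."""
    if not 1 <= minutes < 6000:
        raise ValueError("duration is outside the ranges covered by the legend")
    if minutes >= 600:   # 10 hours <= HH hours < 100 hours
        return "HH hours"
    if minutes >= 60:    # 1 hour <= H hours < 10 hours (assumed to win the 60-100 minute overlap)
        return "H hours"
    if minutes >= 10:    # 10 minutes <= MM minutes (< 100 minutes in the legend)
        return "MM minutes"
    return "M minutes"   # 1 minute <= M minutes < 10 minutes


def parse_voice_name(name: str) -> Tuple[str, str, str]:
    """Split a name like 'af_sky' into (language, gender, speaker).

    Assumes the two-letter prefix is lang_code followed by 'f' or 'm'.
    """
    prefix, _, speaker = name.partition("_")
    if len(prefix) != 2 or prefix[0] not in LANG_CODES or prefix[1] not in ("f", "m") or not speaker:
        raise ValueError(f"unrecognized voice name: {name!r}")
    gender = "female" if prefix[1] == "f" else "male"
    return LANG_CODES[prefix[0]], gender, speaker
```

For example, `parse_voice_name("jm_kumo")` yields the Japanese entry with speaker `kumo`, and `duration_bucket(5)` falls in the `M minutes` bucket that the commit marks with 🤏.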