hubertsiuzdak committed on
Commit • 8f01a00
1 Parent(s): ee450a3
Update README.md
README.md
CHANGED
@@ -8,7 +8,9 @@ tags:
 
 Multi-**S**cale **N**eural **A**udio **C**odec (SNAC) compresses audio into discrete codes at a low bitrate.
 
-See
+👉 This model was primarily trained on music data, and its recommended use case is music (and SFX) generation. See below for other pretrained models.
+
+🔗 GitHub repository: https://github.com/hubertsiuzdak/snac/
 
 ## Overview
 
@@ -18,6 +20,16 @@ covering a broader time span.
 This model compresses 44 kHz audio into discrete codes at a 2.6 kbps bitrate. It uses 4 RVQ levels with token rates of 14, 29, 57, and
 115 Hz.
 
+## Pretrained models
+
+Currently, all models support only single audio channel (mono).
+
+| Model                                                                       | Bitrate   | Sample Rate | Params | Recommended use case     |
+|-----------------------------------------------------------------------------|-----------|-------------|--------|--------------------------|
+| [hubertsiuzdak/snac_24khz](https://huggingface.co/hubertsiuzdak/snac_24khz) | 0.98 kbps | 24 kHz      | 19.8 M | 🗣️ Speech                |
+| [hubertsiuzdak/snac_32khz](https://huggingface.co/hubertsiuzdak/snac_32khz) | 1.9 kbps  | 32 kHz      | 54.5 M | 🎸 Music / Sound Effects |
+| hubertsiuzdak/snac_44khz (this model)                                       | 2.6 kbps  | 44 kHz      | 54.5 M | 🎸 Music / Sound Effects |
+
 ## Usage
 
 Install it using:
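
Note on the Overview figures: the 2.6 kbps bitrate quoted in the README can be roughly reproduced from the four RVQ token rates (14, 29, 57, and 115 Hz). The sketch below is only a sanity check, not the author's calculation. It assumes every token is drawn from a 4096-entry codebook (12 bits per token). That codebook size is not stated anywhere in this diff, and the helper name `bitrate_bps` is hypothetical.

```python
import math

# Token rates (Hz) of the 4 RVQ levels, as quoted in the README text.
TOKEN_RATES_HZ = [14, 29, 57, 115]

# ASSUMPTION: codebook size per RVQ level. Not stated in the diff;
# chosen here only to illustrate the arithmetic.
CODEBOOK_SIZE = 4096


def bitrate_bps(token_rates_hz, codebook_size):
    """Bits per second = (total tokens per second) * (bits per token)."""
    bits_per_token = math.log2(codebook_size)
    return sum(token_rates_hz) * bits_per_token


if __name__ == "__main__":
    bps = bitrate_bps(TOKEN_RATES_HZ, CODEBOOK_SIZE)
    print(f"{sum(TOKEN_RATES_HZ)} tokens/s -> {bps / 1000:.2f} kbps")
```

Under that assumption the four levels emit 215 tokens per second in total, which gives 2580 bits per second, consistent with the 2.6 kbps in the table row for this model.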