---
tags:
- audio-to-audio
- text-to-speech
- speech-to-text
license: cc-by-nc-sa-4.0
language:
- zh
- en
- de
- ja
- fr
- es
- ko
- ar
pipeline_tag: audio-to-audio
inference: false
extra_gated_prompt: >-
  You agree to not use the model to generate contents that violate DMCA or local
  laws.
extra_gated_fields:
  Country: country
  Specific date: date_picker
  I agree to use this model for non-commercial use ONLY: checkbox
---
# Fish Agent V0.1 3B
**Fish Agent V0.1 3B** is a voice-to-voice model that captures and generates environmental audio information. Its architecture is semantic-token-free: it does not rely on separate semantic encoders/decoders such as Whisper and CosyVoice.
It is also a state-of-the-art text-to-speech (TTS) model, trained on roughly 700,000 hours of multilingual audio.
This model was obtained by continued pretraining of Qwen-2.5-3B-Instruct on 200B voice and text tokens.
## Supported Languages
The model supports the following languages with their respective training data sizes:
- English (en): ~300,000 hours
- Chinese (zh): ~300,000 hours
- German (de): ~20,000 hours
- Japanese (ja): ~20,000 hours
- French (fr): ~20,000 hours
- Spanish (es): ~20,000 hours
- Korean (ko): ~20,000 hours
- Arabic (ar): ~20,000 hours
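To make the language balance easier to see, the approximate figures above can be summed and turned into per-language shares. This is only a sketch of the arithmetic: the hour counts are the rounded values from the list, and the names `TRAINING_HOURS` and `language_shares` are illustrative, not part of the Fish Speech codebase.

```python
# Approximate training hours per language, copied from the list above.
# These are rounded figures, so the derived shares are rough as well.
TRAINING_HOURS = {
    "en": 300_000,
    "zh": 300_000,
    "de": 20_000,
    "ja": 20_000,
    "fr": 20_000,
    "es": 20_000,
    "ko": 20_000,
    "ar": 20_000,
}


def language_shares(hours):
    """Return each language's fraction of the total training hours."""
    total = sum(hours.values())
    return {lang: h / total for lang, h in hours.items()}


total_hours = sum(TRAINING_HOURS.values())
shares = language_shares(TRAINING_HOURS)

print(f"total: ~{total_hours:,} hours")
for lang, share in shares.items():
    print(f"{lang}: {share:.1%}")
```

With these rounded figures the total comes to about 720,000 hours. English and Chinese each account for about 41.7% of the data, and each of the remaining six languages for about 2.8%.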
For detailed information and implementation guidelines, please visit our [Fish Speech GitHub repository](https://github.com/fishaudio/fish-speech).
## Citation
If you find this repository helpful in your work, please consider citing:
```bibtex
@misc{fish-agent-0.1,
author = {Shijia Liao and Tianyu Li and Rcell and others},
title = {Fish Agent V0.1 3B},
year = {2024},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/fishaudio/fish-speech}}
}
```
## License
This model and its associated code are released under the CC-BY-NC-SA-4.0 license, which permits non-commercial use with appropriate attribution and requires derivative works to be shared under the same license.