---
language:
- en
tags:
- speech-to-text
- speech-translation
- conversational-AI
- speech-understanding
- whisper
license: apache-2.0
datasets:
- custom
metrics:
- wer
- bleu
- AIR-Bench
---

# Soundwave: Less is More for Speech-Text Alignment in LLMs

🐈‍⬛ GitHub | 📃 Paper | 📼 Online Demo

## Model Description

Soundwave is a Speech-to-Text model that bridges the gap between speech and text. It is trained on just 10k hours of data and delivers exceptional performance in speech translation and AIR-Bench speech tasks.

### Key Features

## Usage

Load the Soundwave model and run inference with your audio files as shown in the GitHub repository.

# 📖 Citation

```
@article{zhang2025soundwave,
  title={Soundwave: Less is More for Speech-Text Alignment in LLMs},
  author={Zhang, Yuhao and Liu, Zhiheng and Bu, Fan and Zhang, Ruiyu and Wang, Benyou and Li, Haizhou},
  journal={arXiv preprint arXiv:2502.12900},
  year={2025}
}
```