WillisDiarize-v1-quantized
Description
WillisDiarize-v1-quantized is an ensemble model that corrects speaker diarization as a post-processing step. It was fine-tuned from the Mistral-7B-Instruct v0.2 foundation model. This is the quantized version of the original model.
During fine-tuning, the training transcripts were generated with three separate automatic speech recognition (ASR) tools: AWS, Azure, and WhisperX. All fine-tuning was done on the Fisher corpus, a dataset of approximately 12,000 recorded conversations and their transcripts. For a full description of model development and performance testing, please read our preprint, Efstathiadis et al. (2024).
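To make "diarization correction as a post-processing step" concrete, the sketch below shows the shape of the problem on toy data: an ASR tool emits words with speaker labels, and a correction step returns the same words with some labels changed. The data, the label names (`speaker0`, `speaker1`), and the helper functions are all hypothetical illustrations; this is not the model's actual input format or the OpenWillis API.

```python
from itertools import groupby

# Hypothetical word-level ASR output: (speaker_label, word) pairs.
asr_words = [
    ("speaker0", "how"), ("speaker0", "are"), ("speaker0", "you"),
    ("speaker0", "fine"),  # assumed to be misattributed by the ASR tool
    ("speaker1", "thanks"),
]

# Hypothetical output of a correction step: same words, revised labels.
corrected_words = [
    ("speaker0", "how"), ("speaker0", "are"), ("speaker0", "you"),
    ("speaker1", "fine"), ("speaker1", "thanks"),
]


def to_turns(words):
    """Merge consecutive words that share a speaker label into turns."""
    return [
        (speaker, " ".join(word for _, word in group))
        for speaker, group in groupby(words, key=lambda pair: pair[0])
    ]


def count_relabelled(before, after):
    """Count words whose speaker label differs between two labelings."""
    if [w for _, w in before] != [w for _, w in after]:
        raise ValueError("both labelings must contain the same words")
    return sum(1 for (a, _), (b, _) in zip(before, after) if a != b)


if __name__ == "__main__":
    for speaker, text in to_turns(corrected_words):
        print(f"{speaker}: {text}")
    print("relabelled words:", count_relabelled(asr_words, corrected_words))
```

In this toy case only the speaker labels change while the words stay fixed, which is why the comparison helper rejects inputs whose word sequences differ.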
WillisDiarize is free to use for non-commercial purposes; see here for the full license text. If you are interested in using this model commercially, please contact us at [email protected].
Usage
Install OpenWillis to easily access this model on either the AWS cloud or any high-performance GPU machine. Follow the installation steps here.
Citation
@article{bklynhlth/WillisDiarize,
title={LLM-based speaker diarization correction: A generalizable approach},
author={Efstathiadis et al.},
journal={arXiv preprint arXiv:2406.04927},
year={2024}
}