---
license:
- mit
- apache-2.0
language:
- en
library_name: transformers
pipeline_tag: audio-classification
tags:
- audio
- tts
---

# StyleTTS 2 Detector

This model was trained for audio classification on a dataset of almost 10,000 human and StyleTTS 2-generated audio clips. It is based on [Whisper](https://huggingface.co./openai/whisper-base).

**NOTE: This model is not affiliated with the author(s) of StyleTTS 2 in any way.**

## Online Demo

An online demo is available [here](https://huggingface.co./spaces/mrfakename/styletts2-detector).

## Usage

**IMPORTANT:** Please read the license, disclaimer, and model card before using the model. You may not use the model if you do not agree to the license and disclaimer.

```python
from transformers import pipeline
import torch

# Run on GPU when available, otherwise fall back to CPU
pipe = pipeline(
    'audio-classification',
    model='mrfakename/styletts2-detector',
    device='cuda' if torch.cuda.is_available() else 'cpu',
)
result = pipe('audio.wav')
print(result)
```

## Tags

Audio is classified as either `real` (human-spoken) or `fake` (generated by StyleTTS 2).

## Disclaimer

The author(s) of this model cannot guarantee complete accuracy; false positives and false negatives may occur. This model should complement, not replace, other precautions such as invisible watermarking or audio watermarking. The model was trained on outputs from the StyleTTS 2 base model, not fine-tunes, so it may not correctly identify audio from fine-tuned models. The author(s) of this model disclaim all liability related to or in connection with the usage of this model.

## License

You may use this model under either the **MIT** or **Apache 2.0** license, at your choice, provided that you include the disclaimer above in all redistributions and require all future redistributions to include the disclaimer.
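The `result` returned by the pipeline is the standard `transformers` audio-classification output: a list of `{'label': ..., 'score': ...}` dictionaries. A minimal sketch of reducing that to a single decision follows; the `interpret` helper and the 0.8 confidence threshold are illustrative assumptions, not part of this model card, and per the disclaimer above no threshold makes the output definitive.

```python
# Sketch: reduce pipeline output to one decision.
# `interpret` and the 0.8 threshold are assumptions for illustration only.
def interpret(result, threshold=0.8):
    top = max(result, key=lambda r: r['score'])
    if top['score'] < threshold:
        # Low-confidence predictions are flagged rather than trusted
        return 'uncertain'
    return top['label']

# Example output shaped like the pipeline's return value
example = [{'label': 'fake', 'score': 0.92}, {'label': 'real', 'score': 0.08}]
print(interpret(example))  # fake
```

A threshold like this trades recall for precision: borderline clips are surfaced for human review instead of being silently labeled.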