---
datasets:
- lewtun/music_genres_small
base_model:
- facebook/wav2vec2-large
metrics:
- accuracy
- f1
tags:
- audio
- music
- classification
- Wav2Vec2
pipeline_tag: audio-classification
---

# Music Genre Classification Model 🎶
This model classifies music genres from raw audio signals (`.wav` files).
It was fine-tuned from **[Wav2Vec2](https://huggingface.co./facebook/wav2vec2-large)** on the **[music_genres_small](https://huggingface.co./datasets/lewtun/music_genres_small)** dataset.

You can find a **GitHub** repository with a web interface, served by a Flask API, for testing the model: **[music-classifier repository](https://github.com/gastonduault/Music-Classifier)**

## Metrics
- **Validation Accuracy**: 75%
- **F1 Score**: 74%
- **Validation Loss**: 0.77
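For reference, the accuracy and F1 figures above are standard classification metrics. A minimal pure-Python sketch of how they are computed from predictions (the labels below are hypothetical, not the model's actual validation outputs, and the averaging mode of the reported F1 is an assumption):

```python
# Hypothetical true labels and predictions, for illustration only
y_true = [0, 1, 2, 1, 0, 3]
y_pred = [0, 1, 1, 1, 0, 3]

def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true label
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    # Per-class F1 (harmonic mean of precision and recall), then averaged
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

print(f"accuracy={accuracy(y_true, y_pred):.2f}, macro F1={macro_f1(y_true, y_pred):.2f}")
```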
  
## Example Usage
```python
from transformers import Wav2Vec2ForSequenceClassification, Wav2Vec2FeatureExtractor
import librosa
import torch

# Mapping from predicted class index to genre label
genre_mapping = {
    0: "Electronic",
    1: "Rock",
    2: "Punk",
    3: "Experimental",
    4: "Hip-Hop",
    5: "Folk",
    6: "Chiptune / Glitch",
    7: "Instrumental",
    8: "Pop",
    9: "International",
}

model = Wav2Vec2ForSequenceClassification.from_pretrained("gastonduault/music-classifier")
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-large")

# Preprocess an audio file: resample to 16 kHz and extract model inputs
def preprocess_audio(audio_path):
    audio_array, sampling_rate = librosa.load(audio_path, sr=16000)
    return feature_extractor(audio_array, sampling_rate=16000, return_tensors="pt", padding=True)

# Path to your audio file
audio_path = "./Nirvana - Come As You Are.wav"

# Preprocess audio
inputs = preprocess_audio(audio_path)

# Predict
with torch.no_grad():
    logits = model(**inputs).logits
    predicted_class = torch.argmax(logits, dim=-1).item()

# Output the result
print(f"Song analyzed: {audio_path}")
print(f"Predicted genre: {genre_mapping[predicted_class]}")
```