Commit 87126d1 by admin · 1 Parent(s): 7ecd01a
Files changed (1)
  1. README.md +8 -21
README.md CHANGED
@@ -15,22 +15,22 @@ tags:
  The music genre classification model is fine-tuned based on a pre-trained model from the computer vision (CV) domain, aiming to classify audio data into different genres. During the pre-training phase, the model learns rich feature representations using a large-scale dataset from computer vision tasks. Through transfer learning, these learned features are applied to the music genre classification task to enhance the model's performance on audio data. In the fine-tuning phase, an audio dataset containing 16 music genre categories is utilized. These audio samples are first transformed into spectrograms, converting the temporal audio signal into a two-dimensional representation in the time and frequency dimensions. The spectrogram representation captures the temporal evolution of different audio frequencies, providing the model with rich information about the audio content. Through fine-tuning, adjustments are made to the pre-trained model to meet the requirements of the music genre classification task. The model learns to extract features from spectrograms that are relevant to music genre, enabling accurate classification of audio samples. This process enables the model to recognize and infer music genres, such as rock, classical, pop, among others. By combining a pre-trained model from the computer vision domain with an audio task, this approach leverages cross-modal knowledge transfer, demonstrating the adaptability and effectiveness of pre-trained models across different domains.
 
  ## Demo
- <https://huggingface.co/spaces/ccmusic-database/music-genre>
+ <https://huggingface.co/spaces/ccmusic-database/music_genre>
 
  ## Usage
  ```python
  from modelscope import snapshot_download
- model_dir = snapshot_download('ccmusic-database/music_genre')
+ model_dir = snapshot_download("ccmusic-database/music_genre")
  ```
 
  ## Maintenance
  ```bash
- GIT_LFS_SKIP_SMUDGE=1 git clone [email protected]:ccmusic-database/music_genre
+ git clone [email protected]:ccmusic-database/music_genre
  cd music_genre
  ```
 
  ## Results
- <img src="https://www.modelscope.cn/api/v1/models/ccmusic-database/music_genre/repo?Revision=master&FilePath=.%2Fgenres_results.png&View=true">
+ ![](https://www.modelscope.cn/models/ccmusic-database/music_genre/resolve/master/genres_results.png)
  A demo result of VGG19_BN fine-tuning:
  <style>
  #pianos td {
@@ -44,15 +44,15 @@ A demo result of VGG19_BN fine-tuning:
  <table id="pianos">
  <tr>
  <th>Loss curve</th>
- <td><img src="https://www.modelscope.cn/api/v1/models/ccmusic-database/music_genre/repo?Revision=master&FilePath=.%2Fvgg19_bn_cqt%2Floss.jpg&View=true"></td>
+ <td><img src="https://www.modelscope.cn/models/ccmusic-database/music_genre/resolve/master/vgg19_bn_cqt/loss.jpg"></td>
  </tr>
  <tr>
  <th>Training and validation accuracy</th>
- <td><img src="https://www.modelscope.cn/api/v1/models/ccmusic-database/music_genre/repo?Revision=master&FilePath=.%2Fvgg19_bn_cqt%2Facc.jpg&View=true"></td>
+ <td><img src="https://www.modelscope.cn/models/ccmusic-database/music_genre/resolve/master/vgg19_bn_cqt/acc.jpg"></td>
  </tr>
  <tr>
  <th>Confusion matrix</th>
- <td><img src="https://www.modelscope.cn/api/v1/models/ccmusic-database/music_genre/repo?Revision=master&FilePath=.%2Fvgg19_bn_cqt%2Fmat.jpg&View=true"></td>
+ <td><img src="https://www.modelscope.cn/models/ccmusic-database/music_genre/resolve/master/vgg19_bn_cqt/mat.jpg"></td>
  </tr>
  </table>
 
@@ -63,17 +63,4 @@ A demo result of VGG19_BN fine-tuning:
  <https://www.modelscope.cn/models/ccmusic-database/music_genre>
 
  ## Evaluation
- <https://github.com/monetjoe/ccmusic_eval>
-
- ## Cite
- ```bibtex
- @dataset{zhaorui_liu_2021_5676893,
- author = {Monan Zhou, Shenyang Xu, Zhaorui Liu, Zhaowen Wang, Feng Yu, Wei Li and Baoqiang Han},
- title = {CCMusic: an Open and Diverse Database for Chinese and General Music Information Retrieval Research},
- month = {mar},
- year = {2024},
- publisher = {HuggingFace},
- version = {1.2},
- url = {https://huggingface.co/ccmusic-database}
- }
- ```
+ <https://github.com/monetjoe/ccmusic_eval>
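
Review note: every image link touched by this commit follows the same rewrite, from the `api/v1/models/<owner>/<repo>/repo?Revision=…&FilePath=…&View=true` form to the shorter `models/<owner>/<repo>/resolve/<revision>/<path>` form. The sketch below reproduces that rewrite so any remaining links could be checked or migrated the same way. The function name `to_resolve_url` is hypothetical, and the two URL shapes are inferred only from the lines in this diff; it is an assumption, not a documented guarantee, that ModelScope serves every file under the `resolve` path.

```python
import posixpath
from urllib.parse import parse_qs, urlparse


def to_resolve_url(old_url: str) -> str:
    """Rewrite an old-style repo-file URL into the resolve-style URL used in this commit.

    Hypothetical helper: both URL layouts are inferred from the diff, not from
    ModelScope documentation.
    """
    parts = urlparse(old_url)
    segments = parts.path.strip("/").split("/")
    # Expected path layout: api / v1 / models / <owner> / <repo> / repo
    if len(segments) != 6 or segments[:3] != ["api", "v1", "models"] or segments[5] != "repo":
        raise ValueError(f"not an api/v1 repo URL: {old_url}")
    owner, name = segments[3], segments[4]

    # parse_qs percent-decodes values, so ".%2Fa%2Fb.jpg" becomes "./a/b.jpg".
    query = parse_qs(parts.query)
    if "Revision" not in query or "FilePath" not in query:
        raise ValueError(f"missing Revision or FilePath in: {old_url}")
    revision = query["Revision"][0]
    # normpath drops the leading "./" so the path joins cleanly.
    file_path = posixpath.normpath(query["FilePath"][0])

    return f"{parts.scheme}://{parts.netloc}/models/{owner}/{name}/resolve/{revision}/{file_path}"
```

Applied to the old loss-curve link from the diff, the helper yields the new link that the commit introduces; the `View=true` parameter is simply dropped, as it is in the commit.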