beltrewilton committed on
Commit 5fc7741 · Parent: 79f2bd8

Update README.md

Files changed (1): README.md (+10 -1)
README.md CHANGED

@@ -37,7 +37,9 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+This model was generated as part of the HF Audio course (which I enjoyed); this architecture currently achieves an accuracy of 0.9 on the audio classification task.
+
+The Audio Spectrogram Transformer is equivalent to [ViT](https://huggingface.co/docs/transformers/model_doc/vit), but applied to audio. Audio is first turned into an image (a spectrogram), after which a Vision Transformer is applied. The model gets state-of-the-art results on several audio classification benchmarks.
 
 ## Intended uses & limitations
 
@@ -61,6 +63,13 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_ratio: 0.1
 - num_epochs: 10
 - mixed_precision_training: Native AMP
+- global_step: 2250
+- training_loss: 0.23970948094350752
+- train_runtime: 1982.7909
+- train_samples_per_second: 4.534
+- train_steps_per_second: 1.135
+- total_flos: 6.094112254328832e+17
+- train_loss: 0.23970948094350752
 
 ### Training results
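The "audio is first turned into an image" step mentioned in the model description can be sketched with a minimal log-mel spectrogram in plain NumPy. This is an illustrative sketch only: the function name and parameter defaults here are assumptions, not the actual AST feature extractor, which in practice comes from a library such as `transformers` or `torchaudio`.

```python
import numpy as np

def log_mel_spectrogram(audio, sr=16000, n_fft=400, hop=160, n_mels=64):
    """Turn a 1-D waveform into a 2-D log-mel spectrogram "image".

    Minimal sketch of the audio-to-image step a ViT-style model consumes;
    real AST pipelines use a library feature extractor instead.
    """
    # Short-time Fourier transform via windowed, overlapping frames.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(audio) - n_fft) // hop
    frames = np.stack([audio[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (frames, n_fft//2 + 1)

    # Triangular mel filterbank mapping linear FFT bins to mel bands.
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    mel_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        if center > left:
            fbank[m - 1, left:center] = (np.arange(left, center) - left) / (center - left)
        if right > center:
            fbank[m - 1, center:right] = (right - np.arange(center, right)) / (right - center)

    mel = power @ fbank.T            # (frames, n_mels)
    return np.log(mel + 1e-10).T     # (n_mels, frames): the 2-D "image"

# One second of a 440 Hz tone sampled at 16 kHz.
t = np.arange(16000) / 16000.0
spec = log_mel_spectrogram(np.sin(2 * np.pi * 440.0 * t))
print(spec.shape)  # (64, 98)
```

The resulting 2-D array is what gets patchified and fed to the transformer, exactly as pixel patches are in ViT.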