joeyMartig committed on
Commit
db3e9a4
1 Parent(s): 14e5818

Added model description

Files changed (1)
  1. README.md +27 -15
README.md CHANGED
@@ -24,15 +24,27 @@ It achieves the following results on the evaluation set:
24
 
25
  ## Model description
26
 
27
- More information needed
28
 
29
  ## Intended uses & limitations
30
 
31
- More information needed
32
 
33
  ## Training and evaluation data
34
 
35
- More information needed
 
36
 
37
  ## Training procedure
38
 
@@ -51,18 +63,18 @@ The following hyperparameters were used during training:
51
 
52
  ### Training results
53
 
54
- | Training Loss | Epoch | Step | Validation Loss | Wer |
55
- |:-------------:|:-----:|:----:|:---------------:|:-------:|
56
- | No log | 1.0 | 7 | 0.9825 | 38.1513 |
57
- | No log | 2.0 | 14 | 0.7112 | 35.7143 |
58
- | No log | 3.0 | 21 | 0.4668 | 68.2353 |
59
- | No log | 4.0 | 28 | 0.2396 | 33.6134 |
60
- | No log | 5.0 | 35 | 0.1178 | 33.4454 |
61
- | No log | 6.0 | 42 | 0.0526 | 33.4454 |
62
- | No log | 7.0 | 49 | 0.0317 | 33.4454 |
63
- | No log | 8.0 | 56 | 0.0165 | 33.4454 |
64
- | No log | 9.0 | 63 | 0.0133 | 33.4454 |
65
- | No log | 10.0 | 70 | 0.0117 | 33.4454 |
66
 
67
 
68
  ### Framework versions
 
24
 
25
  ## Model description
26
 
27
+ The model is a fine-tuned version of OpenAI's Whisper, trained to recognize and transcribe specialized vocabulary from the Dungeons & Dragons (D&D) universe.
28
+ Fine-tuning retrains an existing Whisper model on a custom dataset of audio samples containing D&D-specific terms that the original model did not recognize adequately.
29
+ The goal is to improve transcription of D&D terminology, including the unique names of monsters, characters, and places, making the model a more effective tool for users working with D&D-related content.
30
 
31
  ## Intended uses & limitations
32
 
33
+ ### Intended Uses:
34
+
35
+ - The model is intended for scenarios where accurate transcription of specialized D&D vocabulary is crucial, such as automatically transcribing game sessions, creating subtitles for D&D-related content, or documenting in-game narratives.
36
+ - It is particularly useful for users who regularly work with D&D-specific language that standard transcription models struggle to transcribe accurately.
37
+
38
+ ### Limitations:
39
+
40
+ - The model's performance is constrained by the size and diversity of the training dataset. Because the dataset was small and narrowly focused, the model may perform poorly on accents, voice types, or D&D-specific terms that were not represented in the training set.
41
+ - Training and fine-tuning the model require significant computational resources. The improvements over the base model come at the cost of long processing times and powerful hardware, such as the GPUs available on HPC clusters.
42
+ - Because of the limited data, the model may still produce errors or inconsistent results, especially for terms or phrases outside the scope of the fine-tuning dataset.
43
 
44
  ## Training and evaluation data
45
 
46
+ The training data consisted of 136 initial audio samples derived from a vocabulary of 34 D&D-specific words, with each word used in two different sentences.
47
+ To expand this limited dataset, audio filters were applied to the samples to add variety, increasing the number of training examples fivefold to a total of 680 samples.
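
The augmentation step described above might look roughly like the following sketch. The README does not say which audio filters were used, so the three filters here (added noise, gain change, moving-average smoothing) and every function name are hypothetical stand-ins. The sketch also assumes that "fivefold" means each recording is kept alongside four filtered variants, and it substitutes short random clips for the real recordings.

```python
import math
import random


def add_noise(wave, rng, snr_db=20.0):
    """Mix in Gaussian noise at roughly the given signal-to-noise ratio."""
    power = sum(x * x for x in wave) / len(wave)
    noise_std = math.sqrt(power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, noise_std) for x in wave]


def change_gain(wave, gain_db):
    """Scale the amplitude by a gain expressed in decibels."""
    factor = 10 ** (gain_db / 20)
    return [x * factor for x in wave]


def moving_average(wave, width=5):
    """Crude low-pass filter: average each sample with its neighbours."""
    half = width // 2
    smoothed = []
    for i in range(len(wave)):
        window = wave[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def augment(wave, rng):
    """Return the original clip plus four filtered variants (assumed scheme)."""
    return [
        wave,
        add_noise(wave, rng),
        change_gain(wave, -6.0),
        change_gain(wave, 6.0),
        moving_average(wave),
    ]


rng = random.Random(0)
# Placeholder data: 136 clips of 0.1 s at 16 kHz, filled with random values.
samples = [[rng.uniform(-0.1, 0.1) for _ in range(1600)] for _ in range(136)]
augmented = [variant for wave in samples for variant in augment(wave, rng)]
print(len(augmented))  # 136 * 5 = 680
```

In a real pipeline the clips would be loaded from audio files and the variants written back out or passed to the Whisper feature extractor. The sketch only illustrates how 136 recordings become 680 training examples.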
48
 
49
  ## Training procedure
50
 
 
63
 
64
  ### Training results
65
 
66
+ | Epoch | Step | Validation Loss | Wer |
67
+ |:-----:|:----:|:---------------:|:-------:|
68
+ | 1.0 | 7 | 0.9825 | 38.1513 |
69
+ | 2.0 | 14 | 0.7112 | 35.7143 |
70
+ | 3.0 | 21 | 0.4668 | 68.2353 |
71
+ | 4.0 | 28 | 0.2396 | 33.6134 |
72
+ | 5.0 | 35 | 0.1178 | 33.4454 |
73
+ | 6.0 | 42 | 0.0526 | 33.4454 |
74
+ | 7.0 | 49 | 0.0317 | 33.4454 |
75
+ | 8.0 | 56 | 0.0165 | 33.4454 |
76
+ | 9.0 | 63 | 0.0133 | 33.4454 |
77
+ | 10.0 | 70 | 0.0117 | 33.4454 |
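
For readers unfamiliar with the `Wer` column: word error rate is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words, i.e. (substitutions + deletions + insertions) / N. The values in the table appear to be percentages, which is an assumption based on their magnitude. The sketch below is a minimal pure-Python illustration, not the evaluation code used for this model. The function names and the example sentences are made up.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two word lists (single-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer_percent(reference, hypothesis):
    """Word error rate as a percentage of the reference length."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hyp_words) / len(ref_words)


# Hypothetical example: a D&D monster name misheard as a common word.
score = wer_percent(
    "the tarrasque attacks the party",
    "the terrace attacks the party",
)
print(f"{score:.4f}")  # one substitution out of five words
```

A single misrecognized term in a short sentence moves the score by a large amount, which is worth keeping in mind when reading WER values computed on a small evaluation set.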
78
 
79
 
80
  ### Framework versions