Yova committed on
Commit
1f18612
1 Parent(s): a302eb1

End of training

Files changed (2)
  1. README.md +8 -28
  2. generation_config.json +5 -3
README.md CHANGED
@@ -13,8 +13,8 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.2583
-- Exact Match: 0.0
+- Loss: 1.2427
+- Exact Match: 0.31
 
 ## Model description
 
@@ -34,40 +34,20 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.001
-- train_batch_size: 100
+- train_batch_size: 400
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 400
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: inverse_sqrt
 - lr_scheduler_warmup_steps: 4000
-- num_epochs: 20
+- training_steps: 400
+- label_smoothing_factor: 0.1
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss | Exact Match |
-|:-------------:|:-----:|:----:|:---------------:|:-----------:|
-| 6.6009 | 1.0 | 25 | 5.6681 | 0.0 |
-| 5.9582 | 2.0 | 50 | 4.4844 | 0.0 |
-| 5.0101 | 3.0 | 75 | 3.7434 | 0.0 |
-| 4.3172 | 4.0 | 100 | 3.4055 | 0.0 |
-| 3.9672 | 5.0 | 125 | 3.2224 | 0.0 |
-| 3.7659 | 6.0 | 150 | 3.0110 | 0.0 |
-| 3.5416 | 7.0 | 175 | 2.7741 | 0.0 |
-| 3.3002 | 8.0 | 200 | 2.4967 | 0.0 |
-| 3.0412 | 9.0 | 225 | 2.2354 | 0.0 |
-| 2.8117 | 10.0 | 250 | 2.1391 | 0.0 |
-| 2.6168 | 11.0 | 275 | 2.0822 | 0.0 |
-| 2.4624 | 12.0 | 300 | 2.0147 | 0.0 |
-| 2.3253 | 13.0 | 325 | 1.9378 | 0.0 |
-| 2.2152 | 14.0 | 350 | 1.8335 | 0.0 |
-| 2.11 | 15.0 | 375 | 1.7586 | 0.0 |
-| 2.0029 | 16.0 | 400 | 1.6847 | 0.0 |
-| 1.9103 | 17.0 | 425 | 1.5874 | 0.0 |
-| 1.8166 | 18.0 | 450 | 1.5552 | 0.0 |
-| 1.7264 | 19.0 | 475 | 1.3748 | 0.0 |
-| 1.6562 | 20.0 | 500 | 1.2583 | 0.0 |
+| Training Loss | Epoch | Step | Validation Loss | Exact Match |
+|:-------------:|:------:|:----:|:---------------:|:-----------:|
+| 2.9226 | 133.33 | 400 | 1.2427 | 0.31 |
 
 
 ### Framework versions
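
Reviewer note on the README hyperparameters: the new run pairs `lr_scheduler_type: inverse_sqrt` and `lr_scheduler_warmup_steps: 4000` with `training_steps: 400`. The sketch below shows what that combination implies, assuming the schedule follows the common definition (linear warmup to the base rate, then decay proportional to 1/sqrt(step)). The function name `inverse_sqrt_lr` is hypothetical and this is not the training code from the commit.

```python
import math


def inverse_sqrt_lr(step, base_lr=1e-3, warmup_steps=4000):
    """Learning rate at `step`: linear warmup, then inverse-sqrt decay.

    Assumed definition; defaults are taken from the README hyperparameters
    (learning_rate: 0.001, lr_scheduler_warmup_steps: 4000).
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup period.
        return base_lr * step / max(1, warmup_steps)
    # Decay proportional to 1/sqrt(step), continuous at step == warmup_steps.
    return base_lr * math.sqrt(warmup_steps / step)


# Steps 100 and 400 fall inside warmup; 4000 and 16000 are shown for contrast.
for step in (100, 400, 4000, 16000):
    print(step, inverse_sqrt_lr(step))
```

Under this definition, a 400-step run never leaves warmup: the rate at the final step is roughly one tenth of the configured `learning_rate`. Whether that is intended is worth confirming with the author.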
generation_config.json CHANGED
@@ -1,7 +1,9 @@
 {
-  "decoder_start_token_id": 259,
-  "eos_token_id": 1,
-  "max_new_tokens": 20,
+  "bos_token_id": 1,
+  "decoder_start_token_id": 1,
+  "early_stopping": true,
+  "eos_token_id": 2,
+  "max_new_tokens": 128,
   "num_beams": 5,
   "pad_token_id": 0,
   "transformers_version": "4.35.2"
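
Reviewer note on `generation_config.json`: the key-level changes can be listed mechanically. The sketch below uses only the standard library and the two versions of the file as they appear in this diff. The helper name `config_changes` is hypothetical.

```python
import json

# Both versions of generation_config.json, transcribed from the diff.
OLD = json.loads("""{
  "decoder_start_token_id": 259,
  "eos_token_id": 1,
  "max_new_tokens": 20,
  "num_beams": 5,
  "pad_token_id": 0,
  "transformers_version": "4.35.2"
}""")

NEW = json.loads("""{
  "bos_token_id": 1,
  "decoder_start_token_id": 1,
  "early_stopping": true,
  "eos_token_id": 2,
  "max_new_tokens": 128,
  "num_beams": 5,
  "pad_token_id": 0,
  "transformers_version": "4.35.2"
}""")


def config_changes(old, new):
    """Map each key whose value differs to an (old, new) pair.

    None marks a key that is absent on that side.
    """
    keys = sorted(old.keys() | new.keys())
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


for key, (before, after) in config_changes(OLD, NEW).items():
    print(f"{key}: {before} -> {after}")
```

Five keys differ and three are unchanged. The special-token ids (`bos_token_id`, `decoder_start_token_id`, `eos_token_id`) only make sense relative to the tokenizer, which is not part of this diff, so it is worth checking that the tokenizer files in the repository use the same ids.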