Commit 3064deb by tilyupo
Parent: b0c5dd8

batch_size=4

Files changed (2)
  1. README.md +8 -20
  2. tf_model.h5 +1 -1
README.md CHANGED
@@ -15,20 +15,9 @@ probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Train Loss: 1.0929
- - Validation Loss: 1.4052
- - Epoch: 4
-
- <pre>{'eval_loss': 1.400876522064209,
- 'eval_bleu': 17.847721241494337,
- 'eval_rouge1': 54.52,
- 'eval_rouge2': 31.47,
- 'eval_rougeL': 47.68,
- 'eval_rougeLsum': 47.66,
- 'eval_exact': 0.021183558449130307,
- 'eval_runtime': 239.6854,
- 'eval_samples_per_second': 42.935,
- 'eval_steps_per_second': 1.343}</pre>
+ - Train Loss: 1.2675
+ - Validation Loss: 1.3898
+ - Epoch: 3

  ## Model description

@@ -47,18 +36,17 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - optimizer: {'name': 'Adafactor', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 0.0002, 'beta_2_decay': -0.8, 'epsilon_1': 1e-30, 'epsilon_2': 0.001, 'clip_threshold': 1.0, 'relative_step': False}
+ - optimizer: {'name': 'Adafactor', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 0.00014285714, 'beta_2_decay': -0.8, 'epsilon_1': 1e-30, 'epsilon_2': 0.001, 'clip_threshold': 1.0, 'relative_step': False}
  - training_precision: mixed_bfloat16

  ### Training results

  | Train Loss | Validation Loss | Epoch |
  |:----------:|:---------------:|:-----:|
- | 1.7107 | 1.4525 | 0 |
- | 1.4445 | 1.4003 | 1 |
- | 1.2991 | 1.3924 | 2 |
- | 1.1877 | 1.3867 | 3 |
- | 1.0929 | 1.4052 | 4 |
+ | 1.7429 | 1.4649 | 0 |
+ | 1.4976 | 1.4196 | 1 |
+ | 1.3663 | 1.3913 | 2 |
+ | 1.2675 | 1.3898 | 3 |


  ### Framework versions
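
The two "Training results" tables in the README diff can be compared directly. The following is a minimal sketch, not part of the commit: the helper name `best_epoch` and the variable names are hypothetical, and the numbers are copied from the removed and added table rows.

```python
# Per-epoch (train_loss, val_loss) rows copied from the README diff.
# old_run = rows removed by this commit, new_run = rows added by it.
old_run = [(1.7107, 1.4525), (1.4445, 1.4003), (1.2991, 1.3924),
           (1.1877, 1.3867), (1.0929, 1.4052)]
new_run = [(1.7429, 1.4649), (1.4976, 1.4196), (1.3663, 1.3913),
           (1.2675, 1.3898)]


def best_epoch(rows):
    """Return (epoch, val_loss) for the row with the lowest validation loss."""
    epoch = min(range(len(rows)), key=lambda i: rows[i][1])
    return epoch, rows[epoch][1]


old_best = best_epoch(old_run)
new_best = best_epoch(new_run)

# Validation-loss difference (new minus old) for the epochs both runs share.
val_deltas = [round(new[1] - old[1], 4) for old, new in zip(old_run, new_run)]

print("old best (epoch, val_loss):", old_best)
print("new best (epoch, val_loss):", new_best)
print("val_loss deltas per shared epoch:", val_deltas)
```

In both runs the lowest validation loss falls at epoch 3. The old run's fifth epoch (epoch 4) raised validation loss from 1.3867 to 1.4052, and the new model card stops at epoch 3.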
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:36abc1798475ecfc45790aaa0dd9174bf249afc6c39f8d24ac7c2d4de22bd087
+ oid sha256:0f1a28ef8ecc81b3f0ea1a957a5764ce3afef588c411c6026e86bc8c4fcdd477
  size 439831352
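
The `tf_model.h5` entry is a Git LFS pointer file, so the diff only shows the pointer's `oid` changing while `size` stays the same. A minimal sketch of checking that programmatically follows; `parse_lfs_pointer` is a hypothetical helper written for this illustration, not part of any Git LFS tooling, and the pointer contents are copied from the diff.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


# Pointer contents before and after the commit, copied from the diff.
old_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:36abc1798475ecfc45790aaa0dd9174bf249afc6c39f8d24ac7c2d4de22bd087
size 439831352"""

new_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:0f1a28ef8ecc81b3f0ea1a957a5764ce3afef588c411c6026e86bc8c4fcdd477
size 439831352"""

old = parse_lfs_pointer(old_pointer)
new = parse_lfs_pointer(new_pointer)

# A different oid means different file contents; an equal size means the
# stored file has the same number of bytes.
weights_changed = old["oid"] != new["oid"]
same_size = int(old["size"]) == int(new["size"])

print("weights changed:", weights_changed)
print("same size in bytes:", same_size)
```

A changed hash with an identical byte count is what one would expect when the same architecture is retrained and saved again, although the pointer alone does not confirm that.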