tilyupo committed on
Commit
240aeb7
1 Parent(s): 7e14f98

batch_size=64

Files changed (2)
  1. README.md +10 -19
  2. tf_model.h5 +2 -2
README.md CHANGED
@@ -15,20 +15,9 @@ probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Train Loss: 1.1645
- - Validation Loss: 1.4135
- - Epoch: 2
-
- <pre>{'eval_loss': 1.4102883338928223,
- 'eval_bleu': 17.300335685770165,
- 'eval_rouge1': 53.45,
- 'eval_rouge2': 30.12,
- 'eval_rougeL': 46.5,
- 'eval_rougeLsum': 46.5,
- 'eval_exact': 0.018948595860460597,
- 'eval_runtime': 230.3707,
- 'eval_samples_per_second': 44.671,
- 'eval_steps_per_second': 1.398}</pre>
+ - Train Loss: 1.1217
+ - Validation Loss: 1.4060
+ - Epoch: 4
 
  ## Model description
 
@@ -47,16 +36,18 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - optimizer: {'name': 'Adafactor', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 0.001, 'beta_2_decay': -0.8, 'epsilon_1': 1e-30, 'epsilon_2': 0.001, 'clip_threshold': 1.0, 'relative_step': False}
- - training_precision: float32
+ - optimizer: {'name': 'Adafactor', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 0.0005, 'beta_2_decay': -0.8, 'epsilon_1': 1e-30, 'epsilon_2': 0.001, 'clip_threshold': 1.0, 'relative_step': False}
+ - training_precision: mixed_bfloat16
 
  ### Training results
 
  | Train Loss | Validation Loss | Epoch |
  |:----------:|:---------------:|:-----:|
- | 1.6697 | 1.4399 | 0 |
- | 1.3467 | 1.4110 | 1 |
- | 1.1645 | 1.4135 | 2 |
+ | 1.7310 | 1.4647 | 0 |
+ | 1.4658 | 1.4132 | 1 |
+ | 1.3222 | 1.3984 | 2 |
+ | 1.2141 | 1.3932 | 3 |
+ | 1.1217 | 1.4060 | 4 |
 
 
  ### Framework versions
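
Note on the new training results in the README.md diff: validation loss is lowest at epoch 3 and rises slightly at epoch 4, which is the epoch the updated README reports. The sketch below checks this from the table values. The rows are copied by hand from the added side of the diff, and the names `new_results` and `best_epoch` are illustrative only, not part of the repository.

```python
# Rows copied by hand from the added ("+") side of the Training results table:
# (train_loss, validation_loss, epoch)
new_results = [
    (1.7310, 1.4647, 0),
    (1.4658, 1.4132, 1),
    (1.3222, 1.3984, 2),
    (1.2141, 1.3932, 3),
    (1.1217, 1.4060, 4),
]


def best_epoch(rows):
    """Return the row with the lowest validation loss (hypothetical helper)."""
    return min(rows, key=lambda row: row[1])


best = best_epoch(new_results)
final = new_results[-1]

print(f"best epoch:  {best[2]} (validation loss {best[1]:.4f})")
print(f"final epoch: {final[2]} (validation loss {final[1]:.4f})")
print(f"gap between final and best: {final[1] - best[1]:.4f}")
```

Train loss keeps falling through epoch 4 while validation loss turns upward, so the reported checkpoint may be slightly past the best-validation point. Whether that matters depends on how the uploaded weights were selected, which this diff does not show.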
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:48dc92da26420a855de0e5c705d645c305317bcd89fc9bf432236d41b81228f2
- size 439831352
+ oid sha256:ebbb258db0cb208a82a96fe80a6295202e7f40bd82131663d2c0399a4e03ef60
+ size 439835448
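
Note on the tf_model.h5 change: the file is stored as a Git LFS pointer, so this diff records only a new content hash and byte size, not the weights themselves. The sketch below parses the two pointers and compares them. The pointer text is copied from the diff, and `parse_lfs_pointer` is a hypothetical helper written for illustration, not a Git LFS API.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its "key value" lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer contents copied from the removed and added sides of the diff.
old_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:48dc92da26420a855de0e5c705d645c305317bcd89fc9bf432236d41b81228f2
size 439831352
"""
new_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:ebbb258db0cb208a82a96fe80a6295202e7f40bd82131663d2c0399a4e03ef60
size 439835448
"""

old = parse_lfs_pointer(old_pointer)
new = parse_lfs_pointer(new_pointer)

# Difference in stored file size, in bytes.
size_delta = int(new["size"]) - int(old["size"])

print("hash changed:", old["oid"] != new["oid"])
print("size delta (bytes):", size_delta)
```

The size differs by only a few kilobytes out of roughly 440 MB, while the hash changes completely, as expected when the weights are retrained. The diff alone does not say what accounts for the extra bytes.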