tilyupo/t5-small-trivia-gpu-ca2q

Text2Text Generation

generated_from_keras_callback

tilyupo committed on Aug 7, 2023

Commit 240aeb7 • 1 Parent(s): 7e14f98

batch_size=64

Files changed (2)

README.md +10 -19
tf_model.h5 +2 -2

README.md CHANGED

@@ -15,20 +15,9 @@ probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Train Loss: 1.1645
-- Validation Loss: 1.4135
-- Epoch: 2
-<pre>{'eval_loss': 1.4102883338928223,
- 'eval_bleu': 17.300335685770165,
- 'eval_rouge1': 53.45,
- 'eval_rouge2': 30.12,
- 'eval_rougeL': 46.5,
- 'eval_rougeLsum': 46.5,
- 'eval_exact': 0.018948595860460597,
- 'eval_runtime': 230.3707,
- 'eval_samples_per_second': 44.671,
- 'eval_steps_per_second': 1.398}</pre>
+- Train Loss: 1.1217
+- Validation Loss: 1.4060
+- Epoch: 4
 ## Model description
@@ -47,16 +36,18 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- optimizer: {'name': 'Adafactor', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 0.001, 'beta_2_decay': -0.8, 'epsilon_1': 1e-30, 'epsilon_2': 0.001, 'clip_threshold': 1.0, 'relative_step': False}
-- training_precision: float32
+- optimizer: {'name': 'Adafactor', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 0.0005, 'beta_2_decay': -0.8, 'epsilon_1': 1e-30, 'epsilon_2': 0.001, 'clip_threshold': 1.0, 'relative_step': False}
+- training_precision: mixed_bfloat16
 ### Training results
 | Train Loss | Validation Loss | Epoch |
 |:----------:|:---------------:|:-----:|
-| 1.6697     | 1.4399          | 0     |
-| 1.3467     | 1.4110          | 1     |
-| 1.1645     | 1.4135          | 2     |
+| 1.7310     | 1.4647          | 0     |
+| 1.4658     | 1.4132          | 1     |
+| 1.3222     | 1.3984          | 2     |
+| 1.2141     | 1.3932          | 3     |
+| 1.1217     | 1.4060          | 4     |
 ### Framework versions
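
The two training-results tables in the README.md diff can be compared directly. The sketch below is not part of the commit: it copies the validation losses from the removed and added table rows and reports which epoch had the lowest validation loss in each run. The names `old_run`, `new_run`, and `best_epoch` are illustrative only.

```python
# Validation loss per epoch, copied from the removed (-) and added (+) table rows.
old_run = [1.4399, 1.4110, 1.4135]
new_run = [1.4647, 1.4132, 1.3984, 1.3932, 1.4060]


def best_epoch(val_losses):
    """Return (epoch, loss) for the lowest validation loss; epochs start at 0."""
    epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return epoch, val_losses[epoch]


for name, losses in (("before", old_run), ("after", new_run)):
    epoch, loss = best_epoch(losses)
    print(
        f"{name}: best epoch {epoch} (val loss {loss:.4f}), "
        f"final epoch {len(losses) - 1} (val loss {losses[-1]:.4f})"
    )
```

Read this way, the new run reaches its lowest validation loss (1.3932) at epoch 3, while the model card reports the final epoch 4 (1.4060). Between those two epochs the train loss keeps falling (1.2141 to 1.1217) while the validation loss rises.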

tf_model.h5 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:48dc92da26420a855de0e5c705d645c305317bcd89fc9bf432236d41b81228f2
-size 439831352
+oid sha256:ebbb258db0cb208a82a96fe80a6295202e7f40bd82131663d2c0399a4e03ef60
+size 439835448
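
The tf_model.h5 change is a Git LFS pointer update, so the diff shows only the new object ID and size, not the weights themselves. As a minimal sketch (not part of the commit), the two pointers can be parsed and compared; `parse_lfs_pointer` is a hypothetical helper that assumes the simple `key value` line layout seen above.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])  # size is given in bytes
    return fields


# Pointer contents copied from the removed (-) and added (+) sides of the diff.
old_pointer = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:48dc92da26420a855de0e5c705d645c305317bcd89fc9bf432236d41b81228f2
size 439831352
""")
new_pointer = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:ebbb258db0cb208a82a96fe80a6295202e7f40bd82131663d2c0399a4e03ef60
size 439835448
""")

size_delta = new_pointer["size"] - old_pointer["size"]
print(f"oid changed: {old_pointer['oid'] != new_pointer['oid']}")
print(f"size delta: {size_delta} bytes")
```

The stored file differs in both hash and size. The commit itself does not say what accounts for the size difference.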