---
license: apache-2.0
base_model: t5-small
tags:
  - generated_from_keras_callback
model-index:
  - name: pijarcandra22/t5Sunda2Indo
    results: []
---

# pijarcandra22/t5Sunda2Indo

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unknown dataset. It achieves the following results on the evaluation set:

- Train Loss: 1.6406
- Validation Loss: 1.5932
- Epoch: 140

## Model description

More information needed

## Intended uses & limitations

More information needed
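Although the card leaves usage details blank, the model name suggests Sundanese-to-Indonesian translation. Below is a minimal inference sketch under two assumptions the card does not confirm: that the checkpoint performs that translation direction, and that it expects a T5-style task prefix (the prefix used here is a guess).

```python
# Minimal inference sketch. Assumptions not documented in this card:
# the checkpoint translates Sundanese to Indonesian, and it expects a
# T5-style task prefix (the prefix below is hypothetical).
MODEL_ID = "pijarcandra22/t5Sunda2Indo"

def build_input(sentence: str) -> str:
    # Hypothetical task prefix; check the training code for the real one.
    return f"translate Sundanese to Indonesian: {sentence}"

def translate(sentence: str, max_new_tokens: int = 64) -> str:
    # Lazy imports keep the prompt helper usable without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_input(sentence), return_tensors="tf")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("Kumaha damang?")` downloads the checkpoint on first use; verify the task prefix against the training script before relying on it.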

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
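The optimizer dictionary above can be reconstructed for further fine-tuning. A sketch, assuming `tf.keras.optimizers.AdamW` as a close stock-Keras stand-in for the `AdamWeightDecay` wrapper used by `transformers` (not the identical class):

```python
# Hyperparameters copied verbatim from the card. The card used the
# AdamWeightDecay wrapper from `transformers`; tf.keras.optimizers.AdamW
# is a close stock-Keras stand-in, not the identical class.
OPTIMIZER_CONFIG = {
    "learning_rate": 2e-05,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
    "weight_decay": 0.01,  # maps from the card's weight_decay_rate
}

def make_optimizer():
    # Lazy import so the config dict stays usable without TensorFlow.
    import tensorflow as tf
    return tf.keras.optimizers.AdamW(**OPTIMIZER_CONFIG)
```

One caveat: `AdamWeightDecay` in `transformers` excludes layer-norm and bias parameters from decay by default, while plain `AdamW` decays everything, so results may differ slightly.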

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 3.9668     | 3.4054          | 0     |
| 3.5451     | 3.1460          | 1     |
| 3.3386     | 2.9773          | 2     |
| 3.1917     | 2.8549          | 3     |
| 3.0808     | 2.7568          | 4     |
| 2.9921     | 2.6780          | 5     |
| 2.9187     | 2.6135          | 6     |
| 2.8536     | 2.5547          | 7     |
| 2.8010     | 2.5040          | 8     |
| 2.7482     | 2.4601          | 9     |
| 2.7056     | 2.4227          | 10    |
| 2.6709     | 2.3870          | 11    |
| 2.6325     | 2.3554          | 12    |
| 2.6016     | 2.3233          | 13    |
| 2.5685     | 2.2965          | 14    |
| 2.5407     | 2.2710          | 15    |
| 2.5158     | 2.2486          | 16    |
| 2.4905     | 2.2248          | 17    |
| 2.4680     | 2.2037          | 18    |
| 2.4459     | 2.1856          | 19    |
| 2.4257     | 2.1684          | 20    |
| 2.4045     | 2.1495          | 21    |
| 2.3861     | 2.1325          | 22    |
| 2.3676     | 2.1189          | 23    |
| 2.3479     | 2.1028          | 24    |
| 2.3333     | 2.0873          | 25    |
| 2.3192     | 2.0743          | 26    |
| 2.3035     | 2.0647          | 27    |
| 2.2889     | 2.0504          | 28    |
| 2.2742     | 2.0395          | 29    |
| 2.2608     | 2.0285          | 30    |
| 2.2471     | 2.0166          | 31    |
| 2.2344     | 2.0078          | 32    |
| 2.2211     | 1.9999          | 33    |
| 2.2086     | 1.9893          | 34    |
| 2.1965     | 1.9790          | 35    |
| 2.1879     | 1.9724          | 36    |
| 2.1774     | 1.9637          | 37    |
| 2.1663     | 1.9537          | 38    |
| 2.1573     | 1.9461          | 39    |
| 2.1470     | 1.9389          | 40    |
| 2.1344     | 1.9329          | 41    |
| 2.1259     | 1.9257          | 42    |
| 2.1192     | 1.9158          | 43    |
| 2.1098     | 1.9092          | 44    |
| 2.0993     | 1.9021          | 45    |
| 2.0930     | 1.8970          | 46    |
| 2.0831     | 1.8909          | 47    |
| 2.0729     | 1.8845          | 48    |
| 2.0669     | 1.8799          | 49    |
| 2.0587     | 1.8746          | 50    |
| 2.0519     | 1.8662          | 51    |
| 2.0446     | 1.8605          | 52    |
| 2.0338     | 1.8552          | 53    |
| 2.0297     | 1.8494          | 54    |
| 2.0207     | 1.8441          | 55    |
| 2.0151     | 1.8404          | 56    |
| 2.0116     | 1.8346          | 57    |
| 2.0029     | 1.8286          | 58    |
| 1.9942     | 1.8243          | 59    |
| 1.9894     | 1.8177          | 60    |
| 1.9820     | 1.8145          | 61    |
| 1.9753     | 1.8100          | 62    |
| 1.9698     | 1.8054          | 63    |
| 1.9635     | 1.8001          | 64    |
| 1.9588     | 1.7963          | 65    |
| 1.9533     | 1.7895          | 66    |
| 1.9443     | 1.7888          | 67    |
| 1.9381     | 1.7846          | 68    |
| 1.9320     | 1.7806          | 69    |
| 1.9281     | 1.7755          | 70    |
| 1.9232     | 1.7697          | 71    |
| 1.9141     | 1.7672          | 72    |
| 1.9128     | 1.7655          | 73    |
| 1.9041     | 1.7611          | 74    |
| 1.8987     | 1.7558          | 75    |
| 1.8955     | 1.7498          | 76    |
| 1.8874     | 1.7493          | 77    |
| 1.8845     | 1.7433          | 78    |
| 1.8793     | 1.7403          | 79    |
| 1.8723     | 1.7370          | 80    |
| 1.8669     | 1.7319          | 81    |
| 1.8626     | 1.7323          | 82    |
| 1.8593     | 1.7268          | 83    |
| 1.8540     | 1.7235          | 84    |
| 1.8504     | 1.7204          | 85    |
| 1.8468     | 1.7180          | 86    |
| 1.8398     | 1.7130          | 87    |
| 1.8358     | 1.7088          | 88    |
| 1.8321     | 1.7081          | 89    |
| 1.8271     | 1.7042          | 90    |
| 1.8203     | 1.7016          | 91    |
| 1.8189     | 1.6985          | 92    |
| 1.8117     | 1.6967          | 93    |
| 1.8089     | 1.6924          | 94    |
| 1.8035     | 1.6898          | 95    |
| 1.7984     | 1.6904          | 96    |
| 1.7957     | 1.6836          | 97    |
| 1.7925     | 1.6833          | 98    |
| 1.7880     | 1.6786          | 99    |
| 1.7816     | 1.6770          | 100   |
| 1.7772     | 1.6739          | 101   |
| 1.7754     | 1.6733          | 102   |
| 1.7712     | 1.6678          | 103   |
| 1.7677     | 1.6638          | 104   |
| 1.7641     | 1.6627          | 105   |
| 1.7601     | 1.6609          | 106   |
| 1.7573     | 1.6585          | 107   |
| 1.7516     | 1.6559          | 108   |
| 1.7478     | 1.6523          | 109   |
| 1.7443     | 1.6523          | 110   |
| 1.7406     | 1.6498          | 111   |
| 1.7361     | 1.6475          | 112   |
| 1.7326     | 1.6435          | 113   |
| 1.7285     | 1.6422          | 114   |
| 1.7244     | 1.6398          | 115   |
| 1.7205     | 1.6386          | 116   |
| 1.7187     | 1.6347          | 117   |
| 1.7157     | 1.6335          | 118   |
| 1.7113     | 1.6317          | 119   |
| 1.7094     | 1.6308          | 120   |
| 1.7074     | 1.6267          | 121   |
| 1.7007     | 1.6252          | 122   |
| 1.6965     | 1.6241          | 123   |
| 1.6931     | 1.6231          | 124   |
| 1.6913     | 1.6201          | 125   |
| 1.6862     | 1.6174          | 126   |
| 1.6833     | 1.6176          | 127   |
| 1.6827     | 1.6122          | 128   |
| 1.6774     | 1.6127          | 129   |
| 1.6726     | 1.6119          | 130   |
| 1.6698     | 1.6079          | 131   |
| 1.6663     | 1.6077          | 132   |
| 1.6631     | 1.6055          | 133   |
| 1.6617     | 1.6043          | 134   |
| 1.6573     | 1.6019          | 135   |
| 1.6549     | 1.5994          | 136   |
| 1.6514     | 1.5990          | 137   |
| 1.6484     | 1.5965          | 138   |
| 1.6490     | 1.5942          | 139   |
| 1.6406     | 1.5932          | 140   |
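Both losses are still decreasing at epoch 140, which suggests training had not fully converged. A short script over a few rows transcribed from the table quantifies the overall drop:

```python
# (epoch, train_loss, validation_loss) rows transcribed from the table above.
HISTORY = [
    (0, 3.9668, 3.4054),
    (50, 2.0587, 1.8746),
    (100, 1.7816, 1.6770),
    (140, 1.6406, 1.5932),
]

def relative_improvement(history, column=2):
    """Fractional drop in a loss column between the first and last rows."""
    first, last = history[0][column], history[-1][column]
    return (first - last) / first

# Validation loss fell by roughly 53% and train loss by roughly 59%
# between epoch 0 and epoch 140.
```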

### Framework versions

- Transformers 4.35.2
- TensorFlow 2.14.0
- Datasets 2.15.0
- Tokenizers 0.15.0