metadata
license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: tf-tpu/roberta-base-epochs-100
    results: []

tf-tpu/roberta-base-epochs-100

This model is a fine-tuned version of roberta-base on an unknown dataset. After epoch 76, the latest logged epoch, it achieves the following results on the training and evaluation sets:

  • Train Loss: 1.0821
  • Train Accuracy: 0.1125
  • Validation Loss: 1.0229
  • Validation Accuracy: 0.1148
  • Epoch: 76
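
The card does not say how the loss is computed. As a rough reading aid only, and assuming the reported values are mean per-token cross-entropy in nats (an assumption, not something the card states), the losses above can be converted to perplexity with a short sketch like this. The helper name `perplexity` is hypothetical.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean cross-entropy (in nats) to perplexity.

    Assumption: the loss is a natural-log cross-entropy averaged per token.
    The model card does not confirm this.
    """
    return math.exp(mean_cross_entropy_nats)


# Values copied from the epoch-76 summary in this card.
train_ppl = perplexity(1.0821)
val_ppl = perplexity(1.0229)
print(f"train perplexity ~ {train_ppl:.2f}, validation perplexity ~ {val_ppl:.2f}")
```

If the loss is averaged differently (for example over all positions rather than only the predicted ones), these numbers would not be comparable to perplexities reported elsewhere.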

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 55765, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 2935, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
  • training_precision: mixed_bfloat16
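
The optimizer configuration above describes a learning rate that warms up over 2935 steps to 0.0001 and then follows a polynomial decay with power 1.0 (that is, linear) to 0.0 over 55765 steps. The following is a minimal pure-Python sketch of that schedule, not the training code itself. It assumes the decay phase is counted from the end of warmup and is clamped once `decay_steps` is reached; the function name `learning_rate` and its parameter names are hypothetical.

```python
def learning_rate(step, peak_lr=1e-4, warmup_steps=2935, decay_steps=55765,
                  end_lr=0.0, warmup_power=1.0, decay_power=1.0):
    """Approximate learning rate at a given optimizer step.

    Defaults are copied from the optimizer config in this card.
    Assumption: decay steps are counted from the end of warmup.
    """
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to peak_lr.
        return peak_lr * (step / warmup_steps) ** warmup_power
    # Decay phase: polynomial decay toward end_lr, clamped at decay_steps.
    decay_step = min(step - warmup_steps, decay_steps)
    remaining = 1.0 - decay_step / decay_steps
    return (peak_lr - end_lr) * remaining ** decay_power + end_lr


# Print the rate at a few illustrative steps.
for step in (0, 1000, 2935, 30000, 58700):
    print(step, learning_rate(step))
```

Under these assumptions the rate peaks at step 2935 and reaches zero at step 58700 (2935 + 55765).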

Training results

| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
|------------|----------------|-----------------|---------------------|-------|
| 7.2121 | 0.0274 | 5.7188 | 0.0346 | 0 |
| 5.4335 | 0.0414 | 5.2266 | 0.0439 | 1 |
| 5.1579 | 0.0445 | 5.0625 | 0.0441 | 2 |
| 5.0231 | 0.0447 | 4.9453 | 0.0446 | 3 |
| 4.9323 | 0.0448 | 4.8633 | 0.0443 | 4 |
| 4.8672 | 0.0449 | 4.8789 | 0.0440 | 5 |
| 4.8200 | 0.0449 | 4.8164 | 0.0441 | 6 |
| 4.7841 | 0.0449 | 4.7734 | 0.0450 | 7 |
| 4.7546 | 0.0449 | 4.7539 | 0.0441 | 8 |
| 4.7288 | 0.0449 | 4.7305 | 0.0447 | 9 |
| 4.7084 | 0.0449 | 4.7422 | 0.0443 | 10 |
| 4.6884 | 0.0450 | 4.7148 | 0.0437 | 11 |
| 4.6764 | 0.0449 | 4.7070 | 0.0441 | 12 |
| 4.6637 | 0.0449 | 4.7227 | 0.0435 | 13 |
| 4.5963 | 0.0449 | 4.5195 | 0.0444 | 14 |
| 4.3462 | 0.0468 | 4.0742 | 0.0515 | 15 |
| 3.4139 | 0.0650 | 2.6348 | 0.0797 | 16 |
| 2.5336 | 0.0817 | 2.1816 | 0.0888 | 17 |
| 2.1859 | 0.0888 | 1.9648 | 0.0930 | 18 |
| 2.0043 | 0.0925 | 1.8154 | 0.0961 | 19 |
| 1.8887 | 0.0948 | 1.7129 | 0.0993 | 20 |
| 1.8058 | 0.0965 | 1.6729 | 0.0996 | 21 |
| 1.7402 | 0.0979 | 1.6191 | 0.1010 | 22 |
| 1.6861 | 0.0990 | 1.5693 | 0.1024 | 23 |
| 1.6327 | 0.1001 | 1.5273 | 0.1035 | 24 |
| 1.5906 | 0.1010 | 1.4766 | 0.1042 | 25 |
| 1.5545 | 0.1018 | 1.4561 | 0.1031 | 26 |
| 1.5231 | 0.1024 | 1.4365 | 0.1054 | 27 |
| 1.4957 | 0.1030 | 1.3975 | 0.1046 | 28 |
| 1.4700 | 0.1036 | 1.3789 | 0.1061 | 29 |
| 1.4466 | 0.1041 | 1.3262 | 0.1070 | 30 |
| 1.4253 | 0.1046 | 1.3223 | 0.1072 | 31 |
| 1.4059 | 0.1050 | 1.3096 | 0.1070 | 32 |
| 1.3873 | 0.1054 | 1.3164 | 0.1072 | 33 |
| 1.3703 | 0.1058 | 1.2861 | 0.1072 | 34 |
| 1.3550 | 0.1062 | 1.2705 | 0.1082 | 35 |
| 1.3398 | 0.1065 | 1.2578 | 0.1082 | 36 |
| 1.3260 | 0.1068 | 1.25 | 0.1096 | 37 |
| 1.3127 | 0.1071 | 1.2266 | 0.1102 | 38 |
| 1.2996 | 0.1074 | 1.2305 | 0.1098 | 39 |
| 1.2891 | 0.1077 | 1.2139 | 0.1088 | 40 |
| 1.2783 | 0.1079 | 1.2158 | 0.1093 | 41 |
| 1.2674 | 0.1081 | 1.1787 | 0.1114 | 42 |
| 1.2570 | 0.1084 | 1.1709 | 0.1107 | 43 |
| 1.2478 | 0.1086 | 1.1709 | 0.1104 | 44 |
| 1.2390 | 0.1088 | 1.1777 | 0.1101 | 45 |
| 1.2305 | 0.1090 | 1.1738 | 0.1111 | 46 |
| 1.2215 | 0.1092 | 1.1533 | 0.1112 | 47 |
| 1.2140 | 0.1094 | 1.1514 | 0.1117 | 48 |
| 1.2068 | 0.1096 | 1.1621 | 0.1119 | 49 |
| 1.1991 | 0.1097 | 1.1416 | 0.1108 | 50 |
| 1.1927 | 0.1099 | 1.1279 | 0.1113 | 51 |
| 1.1854 | 0.1101 | 1.1147 | 0.1123 | 52 |
| 1.1800 | 0.1102 | 1.125 | 0.1116 | 53 |
| 1.1727 | 0.1104 | 1.1167 | 0.1116 | 54 |
| 1.1679 | 0.1105 | 1.0884 | 0.1122 | 55 |
| 1.1613 | 0.1106 | 1.1084 | 0.1120 | 56 |
| 1.1563 | 0.1107 | 1.1035 | 0.1119 | 57 |
| 1.1517 | 0.1109 | 1.1035 | 0.1124 | 58 |
| 1.1454 | 0.1111 | 1.0718 | 0.1128 | 59 |
| 1.1403 | 0.1111 | 1.0874 | 0.1123 | 60 |
| 1.1360 | 0.1112 | 1.0742 | 0.1145 | 61 |
| 1.1318 | 0.1114 | 1.0811 | 0.1131 | 62 |
| 1.1277 | 0.1114 | 1.0723 | 0.1129 | 63 |
| 1.1226 | 0.1116 | 1.0640 | 0.1124 | 64 |
| 1.1186 | 0.1117 | 1.0840 | 0.1117 | 65 |
| 1.1144 | 0.1118 | 1.0522 | 0.1139 | 66 |
| 1.1111 | 0.1119 | 1.0557 | 0.1132 | 67 |
| 1.1069 | 0.1119 | 1.0718 | 0.1124 | 68 |
| 1.1038 | 0.1120 | 1.0376 | 0.1135 | 69 |
| 1.1007 | 0.1121 | 1.0537 | 0.1138 | 70 |
| 1.0975 | 0.1121 | 1.0503 | 0.1134 | 71 |
| 1.0941 | 0.1122 | 1.0317 | 0.1140 | 72 |
| 1.0902 | 0.1124 | 1.0439 | 0.1145 | 73 |
| 1.0881 | 0.1124 | 1.0352 | 0.1145 | 74 |
| 1.0839 | 0.1125 | 1.0449 | 0.1144 | 75 |
| 1.0821 | 0.1125 | 1.0229 | 0.1148 | 76 |

Framework versions

  • Transformers 4.27.0.dev0
  • TensorFlow 2.9.1
  • Tokenizers 0.13.2