metadata

license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: tf-tpu/roberta-base-epochs-100
    results: []

tf-tpu/roberta-base-epochs-100

This model is a fine-tuned version of roberta-base on an unknown dataset. Its loss and accuracy on the training and evaluation sets are reported per epoch in the training results table below.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 55765, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 2935, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
training_precision: mixed_bfloat16
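
The learning-rate schedule encoded in the optimizer config above can be sketched in plain Python. This is only an illustration, not the training code: it assumes the WarmUp wrapper scales the rate as (step / warmup_steps) ** power and then passes step - warmup_steps to the decay function, and that PolynomialDecay clamps the step at decay_steps. The function names `polynomial_decay` and `learning_rate` are hypothetical; the constants are copied from the config.

```python
# Constants copied from the optimizer config in this model card.
INITIAL_LR = 1e-4      # initial_learning_rate
WARMUP_STEPS = 2935    # warmup_steps
DECAY_STEPS = 55765    # decay_steps
END_LR = 0.0           # end_learning_rate
POWER = 1.0            # power (1.0 means linear warmup and linear decay)


def polynomial_decay(step: int) -> float:
    """Polynomial decay from INITIAL_LR to END_LR over DECAY_STEPS (no cycling)."""
    step = min(step, DECAY_STEPS)  # assumed: steps past the end are clamped
    remaining = 1.0 - step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * remaining ** POWER + END_LR


def learning_rate(step: int) -> float:
    """Warmup for the first WARMUP_STEPS steps, then polynomial decay."""
    if step < WARMUP_STEPS:
        return INITIAL_LR * (step / WARMUP_STEPS) ** POWER
    # Assumed: the decay schedule is indexed from the end of warmup.
    return polynomial_decay(step - WARMUP_STEPS)


if __name__ == "__main__":
    for s in (0, 1000, WARMUP_STEPS, 30000, WARMUP_STEPS + DECAY_STEPS):
        print(f"step {s:>6}: lr = {learning_rate(s):.3e}")
```

Under these assumptions the rate rises linearly to 0.0001 at step 2935 and then falls linearly to 0.0 over the following 55765 steps.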

Training results

Train Loss	Train Accuracy	Validation Loss	Validation Accuracy	Epoch
7.2121	0.0274	5.7188	0.0346	0
5.4335	0.0414	5.2266	0.0439	1
5.1579	0.0445	5.0625	0.0441	2
5.0231	0.0447	4.9453	0.0446	3
4.9323	0.0448	4.8633	0.0443	4
4.8672	0.0449	4.8789	0.0440	5
4.8200	0.0449	4.8164	0.0441	6
4.7841	0.0449	4.7734	0.0450	7
4.7546	0.0449	4.7539	0.0441	8
4.7288	0.0449	4.7305	0.0447	9
4.7084	0.0449	4.7422	0.0443	10
4.6884	0.0450	4.7148	0.0437	11
4.6764	0.0449	4.7070	0.0441	12
4.6637	0.0449	4.7227	0.0435	13
4.5963	0.0449	4.5195	0.0444	14
4.3462	0.0468	4.0742	0.0515	15
3.4139	0.0650	2.6348	0.0797	16
2.5336	0.0817	2.1816	0.0888	17
2.1859	0.0888	1.9648	0.0930	18
2.0043	0.0925	1.8154	0.0961	19
1.8887	0.0948	1.7129	0.0993	20
1.8058	0.0965	1.6729	0.0996	21
1.7402	0.0979	1.6191	0.1010	22
1.6861	0.0990	1.5693	0.1024	23
1.6327	0.1001	1.5273	0.1035	24
1.5906	0.1010	1.4766	0.1042	25
1.5545	0.1018	1.4561	0.1031	26
1.5231	0.1024	1.4365	0.1054	27
1.4957	0.1030	1.3975	0.1046	28
1.4700	0.1036	1.3789	0.1061	29
1.4466	0.1041	1.3262	0.1070	30
1.4253	0.1046	1.3223	0.1072	31
1.4059	0.1050	1.3096	0.1070	32
1.3873	0.1054	1.3164	0.1072	33
1.3703	0.1058	1.2861	0.1072	34
1.3550	0.1062	1.2705	0.1082	35
1.3398	0.1065	1.2578	0.1082	36
1.3260	0.1068	1.25	0.1096	37
1.3127	0.1071	1.2266	0.1102	38
1.2996	0.1074	1.2305	0.1098	39
1.2891	0.1077	1.2139	0.1088	40
1.2783	0.1079	1.2158	0.1093	41
1.2674	0.1081	1.1787	0.1114	42
1.2570	0.1084	1.1709	0.1107	43
1.2478	0.1086	1.1709	0.1104	44
1.2390	0.1088	1.1777	0.1101	45
1.2305	0.1090	1.1738	0.1111	46
1.2215	0.1092	1.1533	0.1112	47
1.2140	0.1094	1.1514	0.1117	48
1.2068	0.1096	1.1621	0.1119	49
1.1991	0.1097	1.1416	0.1108	50
1.1927	0.1099	1.1279	0.1113	51
1.1854	0.1101	1.1147	0.1123	52
1.1800	0.1102	1.125	0.1116	53
1.1727	0.1104	1.1167	0.1116	54