metadata

license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: tf-tpu/roberta-base-epochs-100
    results: []

tf-tpu/roberta-base-epochs-100

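This model is a fine-tuned version of roberta-base on an unknown dataset. It achieves the following results at the final logged epoch, taken from the training results below:

Train Loss: 2.0043
Train Accuracy: 0.0925
Validation Loss: 1.8154
Validation Accuracy: 0.0961
Epoch: 19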

Model description

More information needed

The following hyperparameters were used during training:

optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 55765, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 2935, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
training_precision: mixed_bfloat16
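
The optimizer config above nests a PolynomialDecay schedule inside a WarmUp wrapper, which is hard to read from the serialized dict. The following is a minimal pure-Python sketch of the learning rate those values imply. It is not the training code: the function name `learning_rate` and the constants are illustrative, and it assumes that the decay schedule is evaluated at `step - warmup_steps` once warmup ends, so the rate would reach zero at step 2935 + 55765 = 58700.

```python
# Values copied from the serialized optimizer config in this card.
INITIAL_LR = 1e-4      # initial_learning_rate
END_LR = 0.0           # end_learning_rate
WARMUP_STEPS = 2935    # warmup_steps
DECAY_STEPS = 55765    # decay_steps
WARMUP_POWER = 1.0     # power of the WarmUp wrapper (linear warmup)
DECAY_POWER = 1.0      # power of PolynomialDecay (linear decay)


def learning_rate(step: int) -> float:
    """Approximate learning rate at a given optimizer step.

    Assumption: after warmup, the decay schedule sees (step - WARMUP_STEPS),
    and with cycle=False the decay step is clipped at DECAY_STEPS.
    """
    if step < WARMUP_STEPS:
        # Warmup: ramp from 0 up to INITIAL_LR.
        return INITIAL_LR * (step / WARMUP_STEPS) ** WARMUP_POWER
    # Decay: polynomial (here linear) decay from INITIAL_LR to END_LR.
    decay_step = min(step - WARMUP_STEPS, DECAY_STEPS)
    remaining = 1.0 - decay_step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * remaining ** DECAY_POWER + END_LR


if __name__ == "__main__":
    # Print the rate at a few points along the schedule.
    for step in (0, 1000, WARMUP_STEPS, 30000, WARMUP_STEPS + DECAY_STEPS):
        print(step, learning_rate(step))
```

With both powers equal to 1.0, the schedule is a linear ramp up to 0.0001 followed by a linear ramp down to 0.0. The `weight_decay_rate` of 0.001 and the `mixed_bfloat16` precision setting are independent of this schedule and are not modeled here.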

Train Loss	Train Accuracy	Validation Loss	Validation Accuracy	Epoch
7.2121	0.0274	5.7188	0.0346	0
5.4335	0.0414	5.2266	0.0439	1
5.1579	0.0445	5.0625	0.0441	2
5.0231	0.0447	4.9453	0.0446	3
4.9323	0.0448	4.8633	0.0443	4
4.8672	0.0449	4.8789	0.0440	5
4.8200	0.0449	4.8164	0.0441	6
4.7841	0.0449	4.7734	0.0450	7
4.7546	0.0449	4.7539	0.0441	8
4.7288	0.0449	4.7305	0.0447	9
4.7084	0.0449	4.7422	0.0443	10
4.6884	0.0450	4.7148	0.0437	11
4.6764	0.0449	4.7070	0.0441	12
4.6637	0.0449	4.7227	0.0435	13
4.5963	0.0449	4.5195	0.0444	14
4.3462	0.0468	4.0742	0.0515	15
3.4139	0.0650	2.6348	0.0797	16
2.5336	0.0817	2.1816	0.0888	17
2.1859	0.0888	1.9648	0.0930	18
2.0043	0.0925	1.8154	0.0961	19