---
license: mit
tags:
  - generated_from_keras_callback
model-index:
  - name: tf-tpu/roberta-base-epochs-100
    results: []
---

tf-tpu/roberta-base-epochs-100

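This model is a fine-tuned version of roberta-base on an unknown dataset. After the last logged epoch (epoch 14), it reaches the following results:

Train Loss: 4.5963
Train Accuracy: 0.0449
Validation Loss: 4.5195
Validation Accuracy: 0.0444
Epoch: 14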

Model description

More information needed

The following hyperparameters were used during training:

optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0001, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0001, 'decay_steps': 55765, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 2935, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.001}
training_precision: mixed_bfloat16
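
The optimizer config above nests a `PolynomialDecay` schedule inside a `WarmUp` wrapper. The pure-Python sketch below approximates the resulting learning-rate curve from the logged numbers. It assumes the wrapper hands `step - warmup_steps` to the decay schedule once warmup ends and that the decay follows the usual polynomial formula. The function name `learning_rate` and its parameter names are illustrative, not part of the training code.

```python
def learning_rate(step: int,
                  peak_lr: float = 1e-4,      # 'initial_learning_rate' in the config
                  warmup_steps: int = 2935,   # 'warmup_steps'
                  decay_steps: int = 55765,   # 'decay_steps'
                  end_lr: float = 0.0,        # 'end_learning_rate'
                  warmup_power: float = 1.0,  # 'power' of WarmUp
                  decay_power: float = 1.0    # 'power' of PolynomialDecay
                  ) -> float:
    """Approximate the WarmUp + PolynomialDecay schedule from the config."""
    if step < warmup_steps:
        # Warmup phase: ramp from 0 to peak_lr (linear, since warmup_power is 1.0).
        return peak_lr * (step / warmup_steps) ** warmup_power
    # Decay phase. Assumption: the decay counter starts at the end of warmup.
    decay_step = min(step - warmup_steps, decay_steps)
    remaining = 1.0 - decay_step / decay_steps
    return (peak_lr - end_lr) * remaining ** decay_power + end_lr


if __name__ == "__main__":
    # Start of training, mid-warmup, peak, and end of the schedule.
    for step in (0, 587, 2935, 2935 + 55765):
        print(step, learning_rate(step))
```

Under these assumptions the rate climbs linearly to 0.0001 over the first 2935 steps, then falls linearly to 0.0 over the next 55765 steps.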

Train Loss	Train Accuracy	Validation Loss	Validation Accuracy	Epoch
7.2121	0.0274	5.7188	0.0346	0
5.4335	0.0414	5.2266	0.0439	1
5.1579	0.0445	5.0625	0.0441	2
5.0231	0.0447	4.9453	0.0446	3
4.9323	0.0448	4.8633	0.0443	4
4.8672	0.0449	4.8789	0.0440	5
4.8200	0.0449	4.8164	0.0441	6
4.7841	0.0449	4.7734	0.0450	7
4.7546	0.0449	4.7539	0.0441	8
4.7288	0.0449	4.7305	0.0447	9
4.7084	0.0449	4.7422	0.0443	10
4.6884	0.0450	4.7148	0.0437	11
4.6764	0.0449	4.7070	0.0441	12
4.6637	0.0449	4.7227	0.0435	13
4.5963	0.0449	4.5195	0.0444	14
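
Validation loss in the table does not fall at every epoch. The short sketch below uses only the validation-loss column copied from the table to find the epoch with the lowest validation loss and the epochs where it rose compared with the previous one. The variable names are illustrative.

```python
# Validation loss per epoch, copied from the table above (epochs 0 to 14).
val_loss = [5.7188, 5.2266, 5.0625, 4.9453, 4.8633, 4.8789, 4.8164, 4.7734,
            4.7539, 4.7305, 4.7422, 4.7148, 4.7070, 4.7227, 4.5195]

# Epoch with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Epochs where validation loss increased relative to the previous epoch.
regressions = [e for e in range(1, len(val_loss)) if val_loss[e] > val_loss[e - 1]]

print(best_epoch, val_loss[best_epoch])
print(regressions)
```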