Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
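As a rough sketch, the values above map directly onto keyword arguments of `Seq2SeqTrainingArguments` in the `transformers` library. The `output_dir` value and the helper name `build_training_args` are hypothetical and do not come from this card. Settings the card does not list, such as the evaluation schedule, are left at their defaults here.

```python
# Hyperparameters exactly as listed in this card.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,
    "num_train_epochs": 3,
    "predict_with_generate": True,
    "fp16": True,  # mixed precision; typically needs a CUDA-capable GPU
}


def build_training_args(output_dir="./results"):
    """Build training arguments from the listed hyperparameters.

    `output_dir` is a placeholder (assumption), not a value from the card.
    The import is deferred so the dictionary above can be inspected
    without `transformers` installed.
    """
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

The resulting object would then be passed to a `Seq2SeqTrainer` together with the model, tokenizer, and datasets, none of which are specified in this card.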
Training Output
global_step=4248,
training_loss=2.930363613782405,
metrics={'train_runtime': 11857.8062,
'train_samples_per_second': 5.014,
'train_steps_per_second': 0.358,
'total_flos': 1.3114345819786445e+17,
'train_loss': 2.930363613782405,
'epoch': 3.0}
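The reported figures can be cross-checked against each other with a little arithmetic. The sketch below assumes a single training device and no gradient accumulation, neither of which is stated in this card. Under those assumptions, the throughput and runtime imply a training set of roughly 19,800 examples.

```python
import math

# Values copied from the hyperparameters and training output above.
batch_size = 14
num_epochs = 3
global_step = 4248
train_runtime = 11857.8062        # seconds
samples_per_second = 5.014
steps_per_second = 0.358

# Optimizer steps per epoch, from the final step count.
steps_per_epoch = global_step // num_epochs

# Approximate number of training examples per epoch, from throughput.
samples_per_epoch = train_runtime * samples_per_second / num_epochs

# Steps per epoch implied by that sample count at batch size 14
# (assumes one device and no gradient accumulation).
implied_steps_per_epoch = math.ceil(samples_per_epoch / batch_size)

# Total steps implied by the reported step rate (rounded in the log,
# so only approximately equal to global_step).
implied_total_steps = steps_per_second * train_runtime

print(steps_per_epoch, implied_steps_per_epoch)
print(round(samples_per_epoch), round(implied_total_steps))
```

Both routes give 1416 steps per epoch, so the logged step count, runtime, and throughput are mutually consistent under the stated assumptions.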
Training Results
Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Gen Len |
---|---|---|---|---|---|---|---|---|
1 | 3.095400 | 2.864138 | 0.425500 | 0.139000 | 0.246300 | 0.246300 | 0.541400 | 141.540900 |
2 | 2.876500 | 2.811244 | 0.425600 | 0.139100 | 0.246500 | 0.246400 | 0.541600 | 141.619000 |
3 | 2.748300 | 2.797923 | 0.425800 | 0.138700 | 0.246400 | 0.246300 | 0.541800 | 141.597000 |