---
license: apache-2.0
base_model: facebook/bart-large
tags:
  - text2text-generation
  - generated_from_trainer
metrics:
  - sacrebleu
model-index:
  - name: model_v2
    results: []
---

# model_v2

This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 1.2418
- Sacrebleu: 66.7409
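
A minimal inference sketch follows. The repo id is an assumption (uploader namespace plus model name); substitute the actual path where model_v2 is hosted, and note that the intended task and input format are not documented in this card.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical repo id; replace with the actual location of model_v2.
model_id = "sehilnlf/model_v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The training task is undocumented, so this input is purely illustrative.
inputs = tokenizer("An example input sentence.", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```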

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
- mixed_precision_training: Native AMP
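
A minimal sketch of these settings as `Seq2SeqTrainingArguments`. The `output_dir`, `evaluation_strategy`, and `predict_with_generate` values are assumptions (the card does not record them); the remaining values mirror the list above, with the total train batch size of 64 arising as 16 x 4 gradient-accumulation steps.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="model_v2",           # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,   # effective train batch size: 16 * 4 = 64
    lr_scheduler_type="linear",
    num_train_epochs=30,
    seed=42,
    fp16=True,                       # native AMP mixed-precision training
    evaluation_strategy="epoch",     # assumption, inferred from the per-epoch results table
    predict_with_generate=True,      # assumption: generation is needed to score SacreBLEU
)
```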

### Training results

| Training Loss | Epoch | Step | Validation Loss | Sacrebleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|
| No log        | 1.0   | 218  | 0.6656          | 66.6707   |
| No log        | 2.0   | 437  | 0.5851          | 66.5767   |
| No log        | 3.0   | 656  | 0.6062          | 66.4734   |
| No log        | 4.0   | 875  | 0.7029          | 66.5944   |
| No log        | 5.0   | 1093 | 0.6852          | 66.0086   |
| No log        | 6.0   | 1312 | 0.7471          | 66.0534   |
| No log        | 7.0   | 1531 | 0.8938          | 66.1986   |
| No log        | 8.0   | 1750 | 0.8834          | 66.4626   |
| No log        | 9.0   | 1968 | 0.8895          | 66.4292   |
| No log        | 10.0  | 2187 | 0.8824          | 66.0577   |
| No log        | 11.0  | 2406 | 0.8781          | 66.5076   |
| No log        | 12.0  | 2625 | 0.9870          | 66.5564   |
| No log        | 13.0  | 2843 | 1.1580          | 66.5116   |
| No log        | 14.0  | 3062 | 0.9797          | 66.3801   |
| No log        | 15.0  | 3281 | 1.0680          | 66.2748   |
| No log        | 16.0  | 3500 | 1.0113          | 66.5282   |
| No log        | 17.0  | 3718 | 1.0023          | 66.5794   |
| No log        | 18.0  | 3937 | 1.0753          | 66.2935   |
| No log        | 19.0  | 4156 | 1.0462          | 66.5036   |
| No log        | 20.0  | 4375 | 1.0934          | 66.7931   |
| No log        | 21.0  | 4593 | 1.1732          | 66.5171   |
| No log        | 22.0  | 4812 | 1.1892          | 66.4821   |
| No log        | 23.0  | 5031 | 1.2766          | 66.5913   |
| No log        | 24.0  | 5250 | 1.2392          | 66.5476   |
| No log        | 25.0  | 5468 | 1.3452          | 66.5616   |
| No log        | 26.0  | 5687 | 1.1427          | 66.7916   |
| No log        | 27.0  | 5906 | 1.1809          | 66.9823   |
| No log        | 28.0  | 6125 | 1.2310          | 66.7958   |
| No log        | 29.0  | 6343 | 1.2147          | 66.7948   |
| No log        | 29.9  | 6540 | 1.2418          | 66.7409   |
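
The evaluation code is not included in the card; below is a minimal sketch of the standard pattern for scoring SacreBLEU during evaluation with the `evaluate` library, under the assumption that the setup here resembled it.

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    """Decode generated token ids and score them against the references."""
    preds, labels = eval_preds
    # Label positions masked with -100 cannot be decoded; restore the pad id.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    return {"sacrebleu": result["score"]}
```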

### Framework versions

- Transformers 4.39.3
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2