---
license: apache-2.0
base_model: facebook/bart-large
tags:
  - text2text-generation
  - generated_from_trainer
metrics:
  - sacrebleu
model-index:
  - name: model_v3
    results: []
---

# model_v3

This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 1.0664
- Sacrebleu: 66.6476
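
A minimal loading sketch using the `transformers` pipeline API; the repo id `sehilnlf/model_v3` is inferred from this card's location, and the input string is a placeholder:

```python
from transformers import pipeline

# Repo id inferred from this card; replace with the actual checkpoint
# path if the model is hosted elsewhere.
generator = pipeline("text2text-generation", model="sehilnlf/model_v3")

output = generator("Your input text here", max_new_tokens=64)
print(output[0]["generated_text"])
```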

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
- mixed_precision_training: Native AMP
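
For reference, a sketch of how these settings map onto `Seq2SeqTrainingArguments` in transformers 4.39; `output_dir` and `predict_with_generate` are assumptions not stated on the card, and the Adam betas/epsilon listed above are the optimizer defaults:

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the configuration implied by the list above.
args = Seq2SeqTrainingArguments(
    output_dir="model_v3",           # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,   # 16 x 4 = 64 effective train batch size
    lr_scheduler_type="linear",
    num_train_epochs=30,
    fp16=True,                       # "Native AMP" mixed-precision training
    predict_with_generate=True,      # assumed; needed to score SacreBLEU
)
# adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8 are the library
# defaults, matching the optimizer line above.
```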

### Training results

| Training Loss | Epoch | Step | Validation Loss | Sacrebleu |
|:-------------:|:-----:|:----:|:---------------:|:---------:|
| No log        | 1.0   | 218  | 0.5750          | 66.6584   |
| No log        | 2.0   | 437  | 0.5581          | 66.9419   |
| No log        | 3.0   | 656  | 0.5662          | 66.8166   |
| No log        | 4.0   | 875  | 0.6339          | 66.8911   |
| No log        | 5.0   | 1093 | 0.6190          | 66.4260   |
| No log        | 6.0   | 1312 | 0.6760          | 66.7698   |
| No log        | 7.0   | 1531 | 0.6708          | 66.7328   |
| No log        | 8.0   | 1750 | 0.7686          | 66.6153   |
| No log        | 9.0   | 1968 | 0.7157          | 66.7670   |
| No log        | 10.0  | 2187 | 0.7567          | 66.6510   |
| No log        | 11.0  | 2406 | 0.7699          | 66.5710   |
| No log        | 12.0  | 2625 | 0.8145          | 66.7658   |
| No log        | 13.0  | 2843 | 0.8292          | 66.4557   |
| No log        | 14.0  | 3062 | 0.8610          | 66.7477   |
| No log        | 15.0  | 3281 | 0.8962          | 66.4487   |
| No log        | 16.0  | 3500 | 0.9000          | 66.6798   |
| No log        | 17.0  | 3718 | 0.9376          | 66.5672   |
| No log        | 18.0  | 3937 | 0.8907          | 66.6538   |
| No log        | 19.0  | 4156 | 0.8829          | 66.5278   |
| No log        | 20.0  | 4375 | 0.9925          | 66.5495   |
| No log        | 21.0  | 4593 | 0.9656          | 66.5410   |
| No log        | 22.0  | 4812 | 0.9721          | 66.4741   |
| No log        | 23.0  | 5031 | 0.9778          | 66.6736   |
| No log        | 24.0  | 5250 | 1.0032          | 66.5801   |
| No log        | 25.0  | 5468 | 1.0808          | 66.6122   |
| No log        | 26.0  | 5687 | 1.0403          | 66.7292   |
| No log        | 27.0  | 5906 | 1.0388          | 66.5946   |
| No log        | 28.0  | 6125 | 1.0707          | 66.6240   |
| No log        | 29.0  | 6343 | 1.0356          | 66.7184   |
| No log        | 29.9  | 6540 | 1.0664          | 66.6476   |
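
The SacreBLEU scores above are on the 0-100 scale. A minimal sketch of how such a score is computed with the Hugging Face `evaluate` library (requires the `sacrebleu` package; the strings below are placeholders, not the real evaluation data):

```python
import evaluate

# Toy inputs; the real evaluation pairs the model's generated text with
# references from the (unspecified) evaluation set.
predictions = ["the cat sat on the mat"]
references = [["the cat sat on the mat"]]  # sacrebleu expects one list of references per prediction

sacrebleu = evaluate.load("sacrebleu")
result = sacrebleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # BLEU on the 0-100 scale
```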

### Framework versions

- Transformers 4.39.3
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2
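
For reproducibility, a small convenience check that a local environment matches the versions listed above:

```python
# Compare installed versions against the pins on this card.
# Note: torch may report a build suffix such as "2.1.2+cu121".
import datasets, tokenizers, torch, transformers

for module, expected in [
    (transformers, "4.39.3"),
    (torch, "2.1.2"),
    (datasets, "2.18.0"),
    (tokenizers, "0.15.2"),
]:
    print(f"{module.__name__:12s} installed={module.__version__} card={expected}")
```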