---
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: bart-base-code-instructiongen
    results: []
inference:
  parameters:
    max_length: 128
    num_beams: 4
---

# bart-base-code-instructiongen

This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on the [pszemraj/fleece2instructions-codealpaca](https://huggingface.co/datasets/pszemraj/fleece2instructions-codealpaca) dataset. It achieves the following results on the evaluation set:

- Loss: 1.0136
- Rouge1: 59.9513
- Rouge2: 33.9118
- Rougel: 55.7815
- Rougelsum: 56.9064
- Gen Len: 29.7146

## Model description

Given an arbitrary snippet of code, the model generates a natural-language instruction that could have produced it (instruction generation, the inverse of the usual instruction-to-code direction).

## Intended uses & limitations

More information needed
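
A minimal usage sketch, assuming the standard Transformers `text2text-generation` pipeline (the example input string is illustrative, not from the training data):

```python
from transformers import pipeline

# load the fine-tuned checkpoint as a text2text pipeline
generator = pipeline(
    "text2text-generation",
    model="pszemraj/bart-base-code-instructiongen",
)

code = "git checkout -b feature/my-new-branch"  # hypothetical input snippet
# decoding settings mirror the inference config above (max_length=128, num_beams=4)
result = generator(code, max_length=128, num_beams=4)
print(result[0]["generated_text"])
```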

## Training and evaluation data

Refer to [pszemraj/fleece2instructions-codealpaca](https://huggingface.co/datasets/pszemraj/fleece2instructions-codealpaca).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 8e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.02
- num_epochs: 3.0
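
For reference, these settings map roughly onto a `Seq2SeqTrainingArguments` sketch like the one below (the `output_dir` is hypothetical, the Adam betas/epsilon are the library defaults, and the multi-GPU launch setup is omitted):

```python
from transformers import Seq2SeqTrainingArguments

# approximate reconstruction of the hyperparameters listed above
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-code-instructiongen",  # hypothetical
    learning_rate=8e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=16,  # total train batch size 64 across GPUs
    lr_scheduler_type="cosine",
    warmup_ratio=0.02,
    num_train_epochs=3.0,
)
```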

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.1165        | 1.0   | 281  | 1.1090          | 57.9239 | 31.9259 | 53.8737 | 54.9811   | 28.2924 |
| 1.0763        | 2.0   | 563  | 1.0267          | 59.9605 | 34.0298 | 55.7523 | 56.8021   | 29.6966 |
| 0.9595        | 2.99  | 843  | 1.0136          | 59.9513 | 33.9118 | 55.7815 | 56.9064   | 29.7146 |
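
The ROUGE columns can be computed with the `evaluate` package; a minimal sketch with placeholder predictions and references (the table reports scores on a 0-100 scale):

```python
import evaluate

rouge = evaluate.load("rouge")
# placeholders: decoded model outputs vs. gold instructions
predictions = ["Create a new git branch for the feature."]
references = ["Write a git command that creates a new branch."]
scores = rouge.compute(predictions=predictions, references=references)
print({k: round(v * 100, 4) for k, v in scores.items()})  # match the 0-100 scale
```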

### Framework versions

- Transformers 4.28.0.dev0
- Pytorch 2.0.0.dev20230212+cu118
- Datasets 2.9.0
- Tokenizers 0.13.2