---
license: apache-2.0
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: bart-base-code-instructiongen
    results: []
inference:
  parameters:
    max_length: 128
    num_beams: 4
---

# bart-base-code-instructiongen

This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on the [pszemraj/fleece2instructions-codealpaca](https://huggingface.co/datasets/pszemraj/fleece2instructions-codealpaca) dataset. It achieves the following results on the evaluation set:

- Loss: 1.0136
- Rouge1: 59.9513
- Rouge2: 33.9118
- Rougel: 55.7815
- Rougelsum: 56.9064
- Gen Len: 29.7146

## Model description

Given an arbitrary snippet of code, the model generates a natural-language instruction that could have produced it (instruction generation, the inverse of the usual instruction-to-code direction).

## Intended uses & limitations

More information needed
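
A minimal usage sketch, assuming the standard Transformers `text2text-generation` pipeline (the example input string is illustrative, not from the training data):

```python
from transformers import pipeline

# load the fine-tuned checkpoint as a text2text pipeline
generator = pipeline(
    "text2text-generation",
    model="pszemraj/bart-base-code-instructiongen",
)

code = "git checkout -b feature/my-new-branch"  # hypothetical input snippet
# decoding settings mirror the inference config above (max_length=128, num_beams=4)
result = generator(code, max_length=128, num_beams=4)
print(result[0]["generated_text"])
```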

## Training and evaluation data

Refer to [pszemraj/fleece2instructions-codealpaca](https://huggingface.co/datasets/pszemraj/fleece2instructions-codealpaca).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 8e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.02
- num_epochs: 3.0
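
For reference, these settings map roughly onto a `Seq2SeqTrainingArguments` sketch like the one below (the `output_dir` is hypothetical, the Adam betas/epsilon are the library defaults, and the multi-GPU launch setup is omitted):

```python
from transformers import Seq2SeqTrainingArguments

# approximate reconstruction of the hyperparameters listed above
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-code-instructiongen",  # hypothetical
    learning_rate=8e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=16,  # total train batch size 64 across GPUs
    lr_scheduler_type="cosine",
    warmup_ratio=0.02,
    num_train_epochs=3.0,
)
```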

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.1165        | 1.0   | 281  | 1.1090          | 57.9239 | 31.9259 | 53.8737 | 54.9811   | 28.2924 |
| 1.0763        | 2.0   | 563  | 1.0267          | 59.9605 | 34.0298 | 55.7523 | 56.8021   | 29.6966 |
| 0.9595        | 2.99  | 843  | 1.0136          | 59.9513 | 33.9118 | 55.7815 | 56.9064   | 29.7146 |
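
The ROUGE columns can be computed with the `evaluate` package; a minimal sketch with placeholder predictions and references (the table reports scores on a 0-100 scale):

```python
import evaluate

rouge = evaluate.load("rouge")
# placeholders: decoded model outputs vs. gold instructions
predictions = ["Create a new git branch for the feature."]
references = ["Write a git command that creates a new branch."]
scores = rouge.compute(predictions=predictions, references=references)
print({k: round(v * 100, 4) for k, v in scores.items()})  # match the 0-100 scale
```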

### Framework versions

- Transformers 4.28.0.dev0
- Pytorch 2.0.0.dev20230212+cu118
- Datasets 2.9.0
- Tokenizers 0.13.2