
bart_samsum_v2

This model is a fine-tuned version of facebook/bart-large-cnn, apparently on the SAMSum dialogue-summarization dataset (the card's dataset field was left unset, but the model name indicates SAMSum). It achieves the following results on the evaluation set:

  • Loss: 0.0236

Model description

bart_samsum_v2 is a fine-tuned version of facebook/bart-large-cnn, a ~406M-parameter BART encoder-decoder originally fine-tuned for CNN/DailyMail news summarization, adapted here to abstractive dialogue summarization.

Intended uses & limitations

The model is meant for abstractive summarization of short, multi-turn chat dialogues (the SAMSum task implied by its name). Limitations are not documented; given the very small training set inferred below, outputs should be validated before production use. A minimal usage sketch follows.
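Since the card gives no examples, here is a minimal usage sketch with the standard transformers summarization pipeline. The repo id mixtralyanis/bart_samsum_v2 is the one this card belongs to; the sample dialogue is an illustrative SAMSum-style input, not taken from the evaluation set.

```python
# Minimal usage sketch (not from the original card).
from transformers import pipeline

summarizer = pipeline("summarization", model="mixtralyanis/bart_samsum_v2")

# Illustrative SAMSum-style dialogue, not evaluation data.
dialogue = (
    "Amanda: I baked cookies. Do you want some?\n"
    "Jerry: Sure!\n"
    "Amanda: I'll bring you some tomorrow :-)"
)

print(summarizer(dialogue, max_length=60, min_length=10, do_sample=False)[0]["summary_text"])
```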

Training and evaluation data

Not documented in the original card. The model name points to the SAMSum dialogue-summarization corpus, but the results table below (roughly six optimizer steps per epoch at an effective batch size of 64) implies a training set of only a few hundred dialogues, i.e. a small subset rather than the full corpus.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 8
  • num_epochs: 15
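
The card does not include the training script. The sketch below shows how the listed settings would map onto the Trainer API; output_dir and the surrounding Trainer setup are assumptions.

```python
# Sketch of how the hyperparameters above map onto Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart_samsum_v2",        # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=16,     # 4 * 16 = 64 total train batch size
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=8,
    num_train_epochs=15,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 matches the Trainer's
    # default optimizer configuration, so no explicit optim setting is needed.
)
```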

Training results

| Training Loss | Epoch | Step | Validation Loss |
|--------------:|------:|-----:|----------------:|
| 9.4233 | 0.17 | 1 | 9.1990 |
| 9.5213 | 0.34 | 2 | 8.5394 |
| 8.7467 | 0.52 | 3 | 8.1115 |
| 8.4697 | 0.69 | 4 | 7.5747 |
| 7.752 | 0.86 | 5 | 6.8712 |
| 7.0515 | 1.03 | 6 | 5.8670 |
| 6.0874 | 1.2 | 7 | 4.6814 |
| 5.0408 | 1.38 | 8 | 3.8055 |
| 4.14 | 1.55 | 9 | 2.6678 |
| 2.9893 | 1.72 | 10 | 1.9701 |
| 2.4337 | 1.89 | 11 | 1.5191 |
| 1.9451 | 2.06 | 12 | 1.2105 |
| 1.53 | 2.24 | 13 | 0.9714 |
| 1.2369 | 2.41 | 14 | 0.7905 |
| 1.0014 | 2.58 | 15 | 0.6478 |
| 0.8419 | 2.75 | 16 | 0.5493 |
| 0.7338 | 2.92 | 17 | 0.4770 |
| 0.6393 | 3.1 | 18 | 0.4151 |
| 0.5747 | 3.27 | 19 | 0.3691 |
| 0.4962 | 3.44 | 20 | 0.3293 |
| 0.4516 | 3.61 | 21 | 0.2935 |
| 0.3995 | 3.78 | 22 | 0.2614 |
| 0.3618 | 3.96 | 23 | 0.2346 |
| 0.3246 | 4.13 | 24 | 0.2129 |
| 0.2929 | 4.3 | 25 | 0.1938 |
| 0.278 | 4.47 | 26 | 0.1770 |
| 0.2493 | 4.65 | 27 | 0.1627 |
| 0.2273 | 4.82 | 28 | 0.1500 |
| 0.2067 | 4.99 | 29 | 0.1381 |
| 0.1917 | 5.16 | 30 | 0.1274 |
| 0.1805 | 5.33 | 31 | 0.1174 |
| 0.1557 | 5.51 | 32 | 0.1081 |
| 0.1495 | 5.68 | 33 | 0.1002 |
| 0.1394 | 5.85 | 34 | 0.0933 |
| 0.1261 | 6.02 | 35 | 0.0868 |
| 0.1155 | 6.19 | 36 | 0.0809 |
| 0.1114 | 6.37 | 37 | 0.0755 |
| 0.1041 | 6.54 | 38 | 0.0705 |
| 0.0952 | 6.71 | 39 | 0.0657 |
| 0.0881 | 6.88 | 40 | 0.0615 |
| 0.0823 | 7.05 | 41 | 0.0577 |
| 0.0778 | 7.23 | 42 | 0.0545 |
| 0.071 | 7.4 | 43 | 0.0515 |
| 0.07 | 7.57 | 44 | 0.0487 |
| 0.0625 | 7.74 | 45 | 0.0463 |
| 0.0589 | 7.91 | 46 | 0.0440 |
| 0.0567 | 8.09 | 47 | 0.0422 |
| 0.0537 | 8.26 | 48 | 0.0411 |
| 0.05 | 8.43 | 49 | 0.0398 |
| 0.0472 | 8.6 | 50 | 0.0384 |
| 0.0458 | 8.77 | 51 | 0.0363 |
| 0.0455 | 8.95 | 52 | 0.0347 |
| 0.0412 | 9.12 | 53 | 0.0340 |
| 0.0414 | 9.29 | 54 | 0.0326 |
| 0.0403 | 9.46 | 55 | 0.0333 |
| 0.0384 | 9.63 | 56 | 0.0303 |
| 0.0353 | 9.81 | 57 | 0.0298 |
| 0.0348 | 9.98 | 58 | 0.0293 |
| 0.0342 | 10.15 | 59 | 0.0275 |
| 0.0311 | 10.32 | 60 | 0.0272 |
| 0.0317 | 10.49 | 61 | 0.0270 |
| 0.0315 | 10.67 | 62 | 0.0261 |
| 0.0289 | 10.84 | 63 | 0.0253 |
| 0.0285 | 11.01 | 64 | 0.0247 |
| 0.0273 | 11.18 | 65 | 0.0244 |
| 0.0277 | 11.35 | 66 | 0.0240 |
| 0.0267 | 11.53 | 67 | 0.0237 |
| 0.0263 | 11.7 | 68 | 0.0237 |
| 0.0258 | 11.87 | 69 | 0.0237 |
| 0.0254 | 12.04 | 70 | 0.0238 |
| 0.0248 | 12.22 | 71 | 0.0239 |
| 0.0246 | 12.39 | 72 | 0.0239 |
| 0.0249 | 12.56 | 73 | 0.0237 |
| 0.0239 | 12.73 | 74 | 0.0236 |
| 0.0247 | 12.9 | 75 | 0.0236 |

Framework versions

  • Transformers 4.38.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2
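
For reproducibility, the versions above can be pinned directly; a requirements-style sketch (the torch 2.1.0+cu121 build additionally needs PyTorch's CUDA 12.1 wheel index, https://download.pytorch.org/whl/cu121):

```
transformers==4.38.1
torch==2.1.0+cu121
datasets==2.17.1
tokenizers==0.15.2
```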