metadata
license: cc-by-sa-4.0
base_model: kiddothe2b/longformer-base-4096
tags:
  - generated_from_trainer
metrics:
  - f1
model-index:
  - name: longformer-base-4096-airlines-news-multi-label
    results: []

longformer-base-4096-airlines-news-multi-label

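This model is a fine-tuned version of kiddothe2b/longformer-base-4096. The training dataset is not specified (the auto-generated card records it as "None"). On the evaluation set it achieves the results below, which match the epoch 59 row (step 3363) of the training results table rather than the final epoch: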

  • Loss: 0.2421
  • F1: 0.9070
  • Roc Auc: 0.6668
  • Hamming: 0.9137
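The card does not say how these metrics were computed. As a rough illustration only, the sketch below computes a micro-averaged F1 and a Hamming score (the fraction of label slots predicted correctly, i.e. 1 minus the Hamming loss) on multi-hot label vectors. Both the averaging mode and the reading of "Hamming" as a score rather than a loss are assumptions; the toy labels are invented for the example and are not taken from the evaluation set.

```python
from typing import Sequence

Labels = Sequence[Sequence[int]]  # one multi-hot row per example


def micro_f1(y_true: Labels, y_pred: Labels) -> float:
    """Micro-averaged F1: pool TP/FP/FN over every (example, label) slot."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def hamming_score(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of label slots that match (assumed meaning of 'Hamming')."""
    total = correct = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            correct += int(t == p)
    return correct / total if total else 0.0


# Toy data, hypothetical: 4 examples, 3 labels.
y_true = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 0, 0], [1, 1, 0], [0, 0, 0]]

f1_value = micro_f1(y_true, y_pred)
hamming_value = hamming_score(y_true, y_pred)
```

If the original evaluation used a different averaging (for example weighted or macro F1), the numbers above would not be directly comparable.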

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 9e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 65
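To make the linear schedule concrete, here is a minimal sketch of the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and a total of 3705 steps, which is the 65 epochs times the 57 steps per epoch shown in the training results. The function name `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(
    step: int,
    base_lr: float = 9e-05,   # learning_rate from the card
    total_steps: int = 3705,  # 65 epochs x 57 steps per epoch
    warmup_steps: int = 0,    # assumption: the card lists no warmup
) -> float:
    """Linear decay from base_lr to 0, with an optional linear warmup."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


lr_start = linear_lr(0)        # full learning rate at the first step
lr_epoch_13 = linear_lr(741)   # end of epoch 13, 20% of the way through
lr_end = linear_lr(3705)       # decayed to zero at the last step
```

Under these assumptions the rate after epoch 13 is 80% of the initial value, since 741 of the 3705 steps have elapsed.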

Training results

| Training Loss | Epoch | Step | Validation Loss | F1 | Roc Auc | Hamming |
|---|---|---|---|---|---|---|
| No log | 1.0 | 57 | 0.3454 | 0.8319 | 0.5 | 0.8850 |
| No log | 2.0 | 114 | 0.3372 | 0.8319 | 0.5 | 0.8850 |
| No log | 3.0 | 171 | 0.3353 | 0.8319 | 0.5 | 0.8850 |
| No log | 4.0 | 228 | 0.3310 | 0.8319 | 0.5 | 0.8850 |
| No log | 5.0 | 285 | 0.3278 | 0.8319 | 0.5 | 0.8850 |
| No log | 6.0 | 342 | 0.3242 | 0.8319 | 0.5 | 0.8850 |
| No log | 7.0 | 399 | 0.3206 | 0.8319 | 0.5 | 0.8850 |
| No log | 8.0 | 456 | 0.3168 | 0.8319 | 0.5 | 0.8850 |
| 0.3599 | 9.0 | 513 | 0.3120 | 0.8319 | 0.5 | 0.8850 |
| 0.3599 | 10.0 | 570 | 0.3089 | 0.8319 | 0.5 | 0.8850 |
| 0.3599 | 11.0 | 627 | 0.3039 | 0.8319 | 0.5 | 0.8850 |
| 0.3599 | 12.0 | 684 | 0.3000 | 0.8319 | 0.5 | 0.8850 |
| 0.3599 | 13.0 | 741 | 0.2969 | 0.8319 | 0.5 | 0.8850 |
| 0.3599 | 14.0 | 798 | 0.2932 | 0.8319 | 0.5 | 0.8850 |
| 0.3599 | 15.0 | 855 | 0.2893 | 0.8449 | 0.5064 | 0.8864 |
| 0.3599 | 16.0 | 912 | 0.2859 | 0.8449 | 0.5064 | 0.8864 |
| 0.3599 | 17.0 | 969 | 0.2824 | 0.8449 | 0.5064 | 0.8864 |
| 0.3111 | 18.0 | 1026 | 0.2800 | 0.8613 | 0.5192 | 0.8894 |
| 0.3111 | 19.0 | 1083 | 0.2773 | 0.8606 | 0.5160 | 0.8886 |
| 0.3111 | 20.0 | 1140 | 0.2752 | 0.8586 | 0.5248 | 0.8894 |
| 0.3111 | 21.0 | 1197 | 0.2727 | 0.8586 | 0.5248 | 0.8894 |
| 0.3111 | 22.0 | 1254 | 0.2703 | 0.8597 | 0.5280 | 0.8901 |
| 0.3111 | 23.0 | 1311 | 0.2679 | 0.8761 | 0.5532 | 0.8953 |
| 0.3111 | 24.0 | 1368 | 0.2665 | 0.8783 | 0.5684 | 0.8975 |
| 0.3111 | 25.0 | 1425 | 0.2645 | 0.8791 | 0.5688 | 0.8982 |
| 0.3111 | 26.0 | 1482 | 0.2627 | 0.8789 | 0.5776 | 0.8990 |
| 0.2854 | 27.0 | 1539 | 0.2611 | 0.8780 | 0.5716 | 0.8982 |
| 0.2854 | 28.0 | 1596 | 0.2597 | 0.8791 | 0.5688 | 0.8982 |
| 0.2854 | 29.0 | 1653 | 0.2584 | 0.8818 | 0.5845 | 0.9012 |
| 0.2854 | 30.0 | 1710 | 0.2570 | 0.8825 | 0.5877 | 0.9019 |
| 0.2854 | 31.0 | 1767 | 0.2564 | 0.8930 | 0.6405 | 0.9115 |
| 0.2854 | 32.0 | 1824 | 0.2556 | 0.8913 | 0.6396 | 0.9100 |
| 0.2854 | 33.0 | 1881 | 0.2547 | 0.8870 | 0.6296 | 0.9071 |
| 0.2854 | 34.0 | 1938 | 0.2531 | 0.8843 | 0.6029 | 0.9041 |
| 0.2854 | 35.0 | 1995 | 0.2522 | 0.8912 | 0.6341 | 0.9100 |
| 0.2722 | 36.0 | 2052 | 0.2516 | 0.8914 | 0.6341 | 0.9100 |
| 0.2722 | 37.0 | 2109 | 0.2507 | 0.8913 | 0.6369 | 0.9100 |
| 0.2722 | 38.0 | 2166 | 0.2501 | 0.8899 | 0.6392 | 0.9093 |
| 0.2722 | 39.0 | 2223 | 0.2491 | 0.8865 | 0.6264 | 0.9063 |
| 0.2722 | 40.0 | 2280 | 0.2486 | 0.8939 | 0.6409 | 0.9122 |
| 0.2722 | 41.0 | 2337 | 0.2483 | 0.8921 | 0.6516 | 0.9115 |
| 0.2722 | 42.0 | 2394 | 0.2474 | 0.8913 | 0.6512 | 0.9108 |
| 0.2722 | 43.0 | 2451 | 0.2466 | 0.8911 | 0.6341 | 0.9100 |
| 0.2652 | 44.0 | 2508 | 0.2461 | 0.8950 | 0.6557 | 0.9137 |
| 0.2652 | 45.0 | 2565 | 0.2459 | 0.8913 | 0.6540 | 0.9108 |
| 0.2652 | 46.0 | 2622 | 0.2453 | 0.8934 | 0.6521 | 0.9122 |
| 0.2652 | 47.0 | 2679 | 0.2446 | 0.8950 | 0.6557 | 0.9137 |
| 0.2652 | 48.0 | 2736 | 0.2445 | 0.8922 | 0.6572 | 0.9115 |
| 0.2652 | 49.0 | 2793 | 0.2442 | 0.8931 | 0.6521 | 0.9122 |
| 0.2652 | 50.0 | 2850 | 0.2440 | 0.8938 | 0.6608 | 0.9130 |
| 0.2652 | 51.0 | 2907 | 0.2436 | 0.8930 | 0.6576 | 0.9122 |
| 0.2652 | 52.0 | 2964 | 0.2432 | 0.8940 | 0.6553 | 0.9130 |
| 0.2603 | 53.0 | 3021 | 0.2430 | 0.8940 | 0.6553 | 0.9130 |
| 0.2603 | 54.0 | 3078 | 0.2428 | 0.8930 | 0.6576 | 0.9122 |
| 0.2603 | 55.0 | 3135 | 0.2425 | 0.8938 | 0.6608 | 0.9130 |
| 0.2603 | 56.0 | 3192 | 0.2424 | 0.8904 | 0.6480 | 0.9100 |
| 0.2603 | 57.0 | 3249 | 0.2424 | 0.8938 | 0.6636 | 0.9130 |
| 0.2603 | 58.0 | 3306 | 0.2422 | 0.8938 | 0.6636 | 0.9130 |
| 0.2603 | 59.0 | 3363 | 0.2421 | 0.9070 | 0.6668 | 0.9137 |
| 0.2603 | 60.0 | 3420 | 0.2419 | 0.9070 | 0.6668 | 0.9137 |
| 0.2603 | 61.0 | 3477 | 0.2418 | 0.8938 | 0.6636 | 0.9130 |
| 0.2578 | 62.0 | 3534 | 0.2418 | 0.8938 | 0.6636 | 0.9130 |
| 0.2578 | 63.0 | 3591 | 0.2416 | 0.8930 | 0.6576 | 0.9122 |
| 0.2578 | 64.0 | 3648 | 0.2416 | 0.8938 | 0.6608 | 0.9130 |
| 0.2578 | 65.0 | 3705 | 0.2416 | 0.8930 | 0.6576 | 0.9122 |
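The headline metrics at the top of the card coincide with the epoch 59 row, not the last epoch, which suggests the reported checkpoint was picked by F1. The card does not confirm this, so the sketch below is only an illustration of how the choice differs depending on the selection criterion. It uses a handful of rows copied from the log above; the `EpochResult` type is a hypothetical helper.

```python
from typing import List, NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    val_loss: float
    f1: float


# A subset of rows from the training results, in epoch order.
results: List[EpochResult] = [
    EpochResult(1, 0.3454, 0.8319),
    EpochResult(44, 0.2461, 0.8950),
    EpochResult(59, 0.2421, 0.9070),
    EpochResult(60, 0.2419, 0.9070),
    EpochResult(63, 0.2416, 0.8930),
    EpochResult(65, 0.2416, 0.8930),
]

# max()/min() return the first item among ties, so earlier epochs win.
best_by_f1 = max(results, key=lambda r: r.f1)
best_by_loss = min(results, key=lambda r: r.val_loss)

print(f"best F1:   epoch {best_by_f1.epoch} (F1={best_by_f1.f1})")
print(f"best loss: epoch {best_by_loss.epoch} (loss={best_by_loss.val_loss})")
```

On this subset, selecting by F1 picks epoch 59 while selecting by validation loss picks epoch 63, so the two criteria lead to different checkpoints with noticeably different F1.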

Framework versions

  • Transformers 4.41.1
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1