metadata
license: cc-by-sa-4.0
base_model: kiddothe2b/longformer-base-4096
tags:
- generated_from_trainer
metrics:
- f1
model-index:
- name: longformer-base-4096-airlines-news-multi-label
  results: []
longformer-base-4096-airlines-news-multi-label
This model is a fine-tuned version of kiddothe2b/longformer-base-4096 on a dataset that the card does not name. It achieves the following results on the evaluation set; the values match the epoch 59 row (step 3363) of the training results table:
- Loss: 0.2421
- F1: 0.9070
- ROC AUC: 0.6668
- Hamming: 0.9137
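For readers who want to reproduce figures of this kind, the sketch below shows one way F1 and the Hamming value could be computed from multi-hot label matrices. The card does not state the averaging mode or whether "Hamming" is a loss or a score, so micro-averaged F1 and a Hamming score defined as 1 minus Hamming loss are assumptions here (the reported value rises as validation loss falls, which is how a score would behave). The function names and toy data are hypothetical, and ROC AUC is left out to keep the example dependency-free.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over every (example, label) cell."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def hamming_score(y_true, y_pred):
    """Fraction of (example, label) cells predicted correctly (1 - Hamming loss)."""
    total = correct = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            correct += int(t == p)
    return correct / total if total else 0.0


# Toy data: 3 examples, 4 labels (made up for illustration only).
y_true = [[1, 0, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0]]
y_pred = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 0, 1, 0]]

print(micro_f1(y_true, y_pred))
print(hamming_score(y_true, y_pred))
```

With sparse labels, a model that predicts few positives can score well on a cell-level Hamming score while still missing most positive labels, so the two numbers are best read together.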
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 9e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 65
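The list above maps fairly directly onto `transformers.TrainingArguments`. The sketch below is a reconstruction under assumptions, not the author's training script: it treats `train_batch_size` as a per-device value (the card lists no total batch size or gradient accumulation), and the `build_training_arguments` helper and its `output_dir` parameter are hypothetical.

```python
# Values copied from the hyperparameter list; keys are TrainingArguments field names.
HYPERPARAMETERS = {
    "learning_rate": 9e-05,
    "per_device_train_batch_size": 32,  # assumption: single device, no accumulation
    "per_device_eval_batch_size": 32,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 65,
}


def build_training_arguments(output_dir):
    """Build TrainingArguments from the values above (hypothetical helper)."""
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

Calling `build_training_arguments("out")` would return an object that can be passed to `Trainer`; the model, datasets, and metric function still have to be supplied separately, since the card does not describe them.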
Training results
Training Loss | Epoch | Step | Validation Loss | F1 | ROC AUC | Hamming |
---|---|---|---|---|---|---|
No log | 1.0 | 57 | 0.3454 | 0.8319 | 0.5 | 0.8850 |
No log | 2.0 | 114 | 0.3372 | 0.8319 | 0.5 | 0.8850 |
No log | 3.0 | 171 | 0.3353 | 0.8319 | 0.5 | 0.8850 |
No log | 4.0 | 228 | 0.3310 | 0.8319 | 0.5 | 0.8850 |
No log | 5.0 | 285 | 0.3278 | 0.8319 | 0.5 | 0.8850 |
No log | 6.0 | 342 | 0.3242 | 0.8319 | 0.5 | 0.8850 |
No log | 7.0 | 399 | 0.3206 | 0.8319 | 0.5 | 0.8850 |
No log | 8.0 | 456 | 0.3168 | 0.8319 | 0.5 | 0.8850 |
0.3599 | 9.0 | 513 | 0.3120 | 0.8319 | 0.5 | 0.8850 |
0.3599 | 10.0 | 570 | 0.3089 | 0.8319 | 0.5 | 0.8850 |
0.3599 | 11.0 | 627 | 0.3039 | 0.8319 | 0.5 | 0.8850 |
0.3599 | 12.0 | 684 | 0.3000 | 0.8319 | 0.5 | 0.8850 |
0.3599 | 13.0 | 741 | 0.2969 | 0.8319 | 0.5 | 0.8850 |
0.3599 | 14.0 | 798 | 0.2932 | 0.8319 | 0.5 | 0.8850 |
0.3599 | 15.0 | 855 | 0.2893 | 0.8449 | 0.5064 | 0.8864 |
0.3599 | 16.0 | 912 | 0.2859 | 0.8449 | 0.5064 | 0.8864 |
0.3599 | 17.0 | 969 | 0.2824 | 0.8449 | 0.5064 | 0.8864 |
0.3111 | 18.0 | 1026 | 0.2800 | 0.8613 | 0.5192 | 0.8894 |
0.3111 | 19.0 | 1083 | 0.2773 | 0.8606 | 0.5160 | 0.8886 |
0.3111 | 20.0 | 1140 | 0.2752 | 0.8586 | 0.5248 | 0.8894 |
0.3111 | 21.0 | 1197 | 0.2727 | 0.8586 | 0.5248 | 0.8894 |
0.3111 | 22.0 | 1254 | 0.2703 | 0.8597 | 0.5280 | 0.8901 |
0.3111 | 23.0 | 1311 | 0.2679 | 0.8761 | 0.5532 | 0.8953 |
0.3111 | 24.0 | 1368 | 0.2665 | 0.8783 | 0.5684 | 0.8975 |
0.3111 | 25.0 | 1425 | 0.2645 | 0.8791 | 0.5688 | 0.8982 |
0.3111 | 26.0 | 1482 | 0.2627 | 0.8789 | 0.5776 | 0.8990 |
0.2854 | 27.0 | 1539 | 0.2611 | 0.8780 | 0.5716 | 0.8982 |
0.2854 | 28.0 | 1596 | 0.2597 | 0.8791 | 0.5688 | 0.8982 |
0.2854 | 29.0 | 1653 | 0.2584 | 0.8818 | 0.5845 | 0.9012 |
0.2854 | 30.0 | 1710 | 0.2570 | 0.8825 | 0.5877 | 0.9019 |
0.2854 | 31.0 | 1767 | 0.2564 | 0.8930 | 0.6405 | 0.9115 |
0.2854 | 32.0 | 1824 | 0.2556 | 0.8913 | 0.6396 | 0.9100 |
0.2854 | 33.0 | 1881 | 0.2547 | 0.8870 | 0.6296 | 0.9071 |
0.2854 | 34.0 | 1938 | 0.2531 | 0.8843 | 0.6029 | 0.9041 |
0.2854 | 35.0 | 1995 | 0.2522 | 0.8912 | 0.6341 | 0.9100 |
0.2722 | 36.0 | 2052 | 0.2516 | 0.8914 | 0.6341 | 0.9100 |
0.2722 | 37.0 | 2109 | 0.2507 | 0.8913 | 0.6369 | 0.9100 |
0.2722 | 38.0 | 2166 | 0.2501 | 0.8899 | 0.6392 | 0.9093 |
0.2722 | 39.0 | 2223 | 0.2491 | 0.8865 | 0.6264 | 0.9063 |
0.2722 | 40.0 | 2280 | 0.2486 | 0.8939 | 0.6409 | 0.9122 |
0.2722 | 41.0 | 2337 | 0.2483 | 0.8921 | 0.6516 | 0.9115 |
0.2722 | 42.0 | 2394 | 0.2474 | 0.8913 | 0.6512 | 0.9108 |
0.2722 | 43.0 | 2451 | 0.2466 | 0.8911 | 0.6341 | 0.9100 |
0.2652 | 44.0 | 2508 | 0.2461 | 0.8950 | 0.6557 | 0.9137 |
0.2652 | 45.0 | 2565 | 0.2459 | 0.8913 | 0.6540 | 0.9108 |
0.2652 | 46.0 | 2622 | 0.2453 | 0.8934 | 0.6521 | 0.9122 |
0.2652 | 47.0 | 2679 | 0.2446 | 0.8950 | 0.6557 | 0.9137 |
0.2652 | 48.0 | 2736 | 0.2445 | 0.8922 | 0.6572 | 0.9115 |
0.2652 | 49.0 | 2793 | 0.2442 | 0.8931 | 0.6521 | 0.9122 |
0.2652 | 50.0 | 2850 | 0.2440 | 0.8938 | 0.6608 | 0.9130 |
0.2652 | 51.0 | 2907 | 0.2436 | 0.8930 | 0.6576 | 0.9122 |
0.2652 | 52.0 | 2964 | 0.2432 | 0.8940 | 0.6553 | 0.9130 |
0.2603 | 53.0 | 3021 | 0.2430 | 0.8940 | 0.6553 | 0.9130 |
0.2603 | 54.0 | 3078 | 0.2428 | 0.8930 | 0.6576 | 0.9122 |
0.2603 | 55.0 | 3135 | 0.2425 | 0.8938 | 0.6608 | 0.9130 |
0.2603 | 56.0 | 3192 | 0.2424 | 0.8904 | 0.6480 | 0.9100 |
0.2603 | 57.0 | 3249 | 0.2424 | 0.8938 | 0.6636 | 0.9130 |
0.2603 | 58.0 | 3306 | 0.2422 | 0.8938 | 0.6636 | 0.9130 |
0.2603 | 59.0 | 3363 | 0.2421 | 0.9070 | 0.6668 | 0.9137 |
0.2603 | 60.0 | 3420 | 0.2419 | 0.9070 | 0.6668 | 0.9137 |
0.2603 | 61.0 | 3477 | 0.2418 | 0.8938 | 0.6636 | 0.9130 |
0.2578 | 62.0 | 3534 | 0.2418 | 0.8938 | 0.6636 | 0.9130 |
0.2578 | 63.0 | 3591 | 0.2416 | 0.8930 | 0.6576 | 0.9122 |
0.2578 | 64.0 | 3648 | 0.2416 | 0.8938 | 0.6608 | 0.9130 |
0.2578 | 65.0 | 3705 | 0.2416 | 0.8930 | 0.6576 | 0.9122 |
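The Step column allows a rough sanity check of the run. The sketch below derives the total step count from the table and brackets the training-set size, assuming one device, no gradient accumulation, and a retained last partial batch; none of these is stated in the card. It also assumes the `Trainer` default of logging the training loss every 500 steps, which would explain why the first eight rows show "No log".

```python
import math

TRAIN_BATCH_SIZE = 32   # from the hyperparameter list
NUM_EPOCHS = 65         # from the hyperparameter list
STEPS_PER_EPOCH = 57    # Step column after epoch 1.0
LOGGING_STEPS = 500     # assumption: Trainer default, not stated in the card

total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

# With the last partial batch kept, 57 steps means between 56 full batches
# plus one example and 57 full batches.
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

# First epoch whose final step is at or past the first logging step.
first_logged_epoch = math.ceil(LOGGING_STEPS / STEPS_PER_EPOCH)

print(total_steps)
print(min_examples, max_examples)
print(first_logged_epoch)
```

The computed total agrees with the final row of the table (step 3705), and epoch 9 is the first row that reports a training loss.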
Framework versions
- Transformers 4.41.1
- PyTorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1