---
license: mit
base_model: xlnet/xlnet-base-cased
tags:
  - generated_from_trainer
metrics:
  - f1
model-index:
  - name: xlnet-base-cased-airlines-news-multi-label-7-actions
    results: []
---

# xlnet-base-cased-airlines-news-multi-label-7-actions

This model is a fine-tuned version of [xlnet/xlnet-base-cased](https://huggingface.co/xlnet/xlnet-base-cased) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.2151
- F1: 0.9188
- Jaccard: 0.6316
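
Because this is a multi-label model (each news item can trigger several of the 7 action labels), inference decides each label independently from its own logit rather than taking an argmax. A minimal sketch of that decision rule, assuming the conventional 0.5 probability threshold (the card does not state the threshold actually used):

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(logits, threshold=0.5):
    """Multi-label decision: each of the 7 action labels is predicted
    independently when sigmoid(logit) >= threshold."""
    return [int(sigmoid(z) >= threshold) for z in logits]

# Hypothetical logits for the 7 action classes from the classification head.
logits = [2.3, -1.7, 0.1, -3.2, 1.5, -0.4, 0.9]
print(predict_labels(logits))  # -> [1, 0, 1, 0, 1, 0, 1]
```

In practice the logits would come from loading the checkpoint with `AutoModelForSequenceClassification` (with `problem_type="multi_label_classification"`) and running the tokenized news text through it.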

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 150
- num_epochs: 45
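
The linear schedule with warmup behind these settings can be sketched as a plain function; `total_steps = 2565` is an assumption derived from the results table below (57 optimizer steps per epoch × 45 epochs):

```python
def linear_warmup_lr(step, base_lr=1e-4, warmup_steps=150, total_steps=2565):
    """Learning rate at a given optimizer step under a linear schedule
    with warmup: ramp from 0 to base_lr over warmup_steps, then decay
    linearly back to 0 by total_steps."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(linear_warmup_lr(75))    # mid-warmup: half the base rate
print(linear_warmup_lr(150))   # warmup complete: full base rate
print(linear_warmup_lr(2565))  # final step: decayed to zero
```

This matches the shape produced by `transformers`' `get_linear_schedule_with_warmup`, which the Trainer uses for `lr_scheduler_type: linear`.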

### Training results

| Training Loss | Epoch | Step | Validation Loss | F1     | Jaccard |
|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
| No log        | 1.0   | 57   | 0.4002          | 0.8330 | 0.4071  |
| No log        | 2.0   | 114  | 0.3221          | 0.8443 | 0.4115  |
| No log        | 3.0   | 171  | 0.2836          | 0.8703 | 0.4565  |
| No log        | 4.0   | 228  | 0.2654          | 0.8731 | 0.4786  |
| No log        | 5.0   | 285  | 0.2478          | 0.8825 | 0.5288  |
| No log        | 6.0   | 342  | 0.2410          | 0.8925 | 0.5395  |
| No log        | 7.0   | 399  | 0.2337          | 0.8990 | 0.5830  |
| No log        | 8.0   | 456  | 0.2313          | 0.8997 | 0.5586  |
| 0.3113        | 9.0   | 513  | 0.2287          | 0.8974 | 0.5579  |
| 0.3113        | 10.0  | 570  | 0.2242          | 0.9037 | 0.6080  |
| 0.3113        | 11.0  | 627  | 0.2215          | 0.9036 | 0.5911  |
| 0.3113        | 12.0  | 684  | 0.2209          | 0.9072 | 0.5977  |
| 0.3113        | 13.0  | 741  | 0.2192          | 0.9093 | 0.6169  |
| 0.3113        | 14.0  | 798  | 0.2193          | 0.9076 | 0.6125  |
| 0.3113        | 15.0  | 855  | 0.2206          | 0.9067 | 0.6257  |
| 0.3113        | 16.0  | 912  | 0.2166          | 0.9121 | 0.6265  |
| 0.3113        | 17.0  | 969  | 0.2151          | 0.9188 | 0.6316  |
| 0.2282        | 18.0  | 1026 | 0.2143          | 0.9107 | 0.6279  |
| 0.2282        | 19.0  | 1083 | 0.2132          | 0.9105 | 0.6346  |
| 0.2282        | 20.0  | 1140 | 0.2132          | 0.9106 | 0.6279  |
| 0.2282        | 21.0  | 1197 | 0.2174          | 0.9122 | 0.6243  |
| 0.2282        | 22.0  | 1254 | 0.2161          | 0.9139 | 0.6221  |
| 0.2282        | 23.0  | 1311 | 0.2127          | 0.9119 | 0.6287  |
| 0.2282        | 24.0  | 1368 | 0.2129          | 0.9119 | 0.6353  |
| 0.2282        | 25.0  | 1425 | 0.2112          | 0.9147 | 0.6324  |
| 0.2282        | 26.0  | 1482 | 0.2122          | 0.9129 | 0.6386  |
| 0.2142        | 27.0  | 1539 | 0.2126          | 0.9146 | 0.6272  |
| 0.2142        | 28.0  | 1596 | 0.2114          | 0.9144 | 0.6331  |
| 0.2142        | 29.0  | 1653 | 0.2109          | 0.9159 | 0.6434  |
| 0.2142        | 30.0  | 1710 | 0.2102          | 0.9165 | 0.6479  |
| 0.2142        | 31.0  | 1767 | 0.2120          | 0.9114 | 0.6397  |
| 0.2142        | 32.0  | 1824 | 0.2110          | 0.9125 | 0.6405  |
| 0.2142        | 33.0  | 1881 | 0.2111          | 0.9093 | 0.6250  |
| 0.2142        | 34.0  | 1938 | 0.2110          | 0.9109 | 0.6243  |
| 0.2142        | 35.0  | 1995 | 0.2111          | 0.9110 | 0.6324  |
| 0.2084        | 36.0  | 2052 | 0.2127          | 0.9141 | 0.6394  |
| 0.2084        | 37.0  | 2109 | 0.2112          | 0.9098 | 0.6265  |
| 0.2084        | 38.0  | 2166 | 0.2109          | 0.9112 | 0.6309  |
| 0.2084        | 39.0  | 2223 | 0.2109          | 0.9137 | 0.6405  |
| 0.2084        | 40.0  | 2280 | 0.2111          | 0.9122 | 0.6346  |
| 0.2084        | 41.0  | 2337 | 0.2111          | 0.9111 | 0.6368  |
| 0.2084        | 42.0  | 2394 | 0.2113          | 0.9099 | 0.6279  |
| 0.2084        | 43.0  | 2451 | 0.2112          | 0.9105 | 0.6324  |
| 0.2054        | 44.0  | 2508 | 0.2111          | 0.9105 | 0.6287  |
| 0.2054        | 45.0  | 2565 | 0.2110          | 0.9089 | 0.6221  |
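
The card does not state which averaging the F1 and Jaccard columns use. As one plausible reading for multi-label evaluation (an assumption, not confirmed by the card), micro-averaged F1 and per-sample Jaccard can be computed from binary label matrices like this:

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives and false negatives
    over every (sample, label) cell before computing F1."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            tp += t and p
            fp += (not t) and p
            fn += t and (not p)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def samples_jaccard(y_true, y_pred):
    """Jaccard averaged per sample: |intersection| / |union| of the
    true and predicted label sets for each example."""
    scores = []
    for t_row, p_row in zip(y_true, y_pred):
        inter = sum(t and p for t, p in zip(t_row, p_row))
        union = sum(t or p for t, p in zip(t_row, p_row))
        scores.append(inter / union if union else 1.0)
    return sum(scores) / len(scores)

# Toy 2-sample, 3-label example (not from this model's dataset).
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
print(micro_f1(y_true, y_pred))         # -> 0.666...
print(samples_jaccard(y_true, y_pred))  # -> 0.5
```

Jaccard being well below F1, as in the table above, is typical: a single wrong label in a small label set costs Jaccard heavily while micro-F1 absorbs it across all cells.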

## Framework versions

- Transformers 4.41.1
- Pytorch 2.3.0+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1