metadata
license: mit
base_model: xlnet/xlnet-base-cased
tags:
  - generated_from_trainer
metrics:
  - f1
model-index:
  - name: xlnet-base-cased-airlines-news-multi-label
    results: []

xlnet-base-cased-airlines-news-multi-label

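This model is a fine-tuned version of xlnet/xlnet-base-cased; the card does not name the training dataset. On the evaluation set it achieves the following results, which match the epoch 36 row of the training results (the highest F1 in the log):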

  • Loss: 0.2250
  • F1: 0.9209
  • Jaccard: 0.6829
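
The card does not say how F1 and Jaccard were averaged across labels, so the following is only a minimal sketch of one common way to score multi-label predictions: micro-averaged F1 and per-sample Jaccard. The function names `micro_f1` and `samples_jaccard`, the toy label matrices, and the choice of averaging are all assumptions for illustration, not the author's evaluation code.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over every (sample, label) cell."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if t and p:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def samples_jaccard(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Mean over samples of |true AND pred| / |true OR pred|."""
    scores = []
    for t_row, p_row in zip(y_true, y_pred):
        inter = sum(1 for t, p in zip(t_row, p_row) if t and p)
        union = sum(1 for t, p in zip(t_row, p_row) if t or p)
        # Assumption of this sketch: two empty label sets count as a perfect match.
        scores.append(inter / union if union else 1.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 2 samples, 3 labels (binary indicator rows).
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]

print(round(micro_f1(y_true, y_pred), 4))  # 0.6667
print(samples_jaccard(y_true, y_pred))     # 0.5
```

Per-sample Jaccard penalises every sample whose predicted label set is not exactly right, which is one possible reason it can sit well below a pooled F1 on the same predictions.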

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 9e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 150
  • num_epochs: 65
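
To make the `linear` scheduler with 150 warmup steps concrete, here is a minimal sketch of the learning rate at a given optimizer step. It assumes the usual linear-warmup, linear-decay shape and a total of 3705 steps (the final step in the training results); the function name `linear_lr` is hypothetical and this is not the Trainer's own code.

```python
def linear_lr(
    step: int,
    peak_lr: float = 9e-5,     # learning_rate from the card
    warmup_steps: int = 150,   # lr_scheduler_warmup_steps from the card
    total_steps: int = 3705,   # last step in the training results (65 epochs x 57 steps)
) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to peak_lr over the warmup steps.
        return peak_lr * step / max(1, warmup_steps)
    # Decay linearly from peak_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


for s in (0, 75, 150, 3705):
    print(s, linear_lr(s))
```

Under these assumptions the rate peaks at step 150, which falls inside the third epoch, and then decays for the rest of the run.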

Training results

| Training Loss | Epoch | Step | Validation Loss | F1 | Jaccard |
|---|---|---|---|---|---|
| No log | 1.0 | 57 | 0.3815 | 0.8283 | 0.4558 |
| No log | 2.0 | 114 | 0.3186 | 0.8287 | 0.4602 |
| No log | 3.0 | 171 | 0.2839 | 0.8781 | 0.5133 |
| No log | 4.0 | 228 | 0.2660 | 0.8890 | 0.5457 |
| No log | 5.0 | 285 | 0.2532 | 0.8996 | 0.5833 |
| No log | 6.0 | 342 | 0.2471 | 0.8966 | 0.5745 |
| No log | 7.0 | 399 | 0.2412 | 0.9066 | 0.6069 |
| No log | 8.0 | 456 | 0.2393 | 0.9065 | 0.5981 |
| 0.323 | 9.0 | 513 | 0.2354 | 0.9043 | 0.6025 |
| 0.323 | 10.0 | 570 | 0.2340 | 0.9087 | 0.6077 |
| 0.323 | 11.0 | 627 | 0.2326 | 0.9122 | 0.6283 |
| 0.323 | 12.0 | 684 | 0.2305 | 0.9161 | 0.6401 |
| 0.323 | 13.0 | 741 | 0.2297 | 0.9113 | 0.6350 |
| 0.323 | 14.0 | 798 | 0.2284 | 0.9138 | 0.6416 |
| 0.323 | 15.0 | 855 | 0.2281 | 0.9130 | 0.6497 |
| 0.323 | 16.0 | 912 | 0.2248 | 0.9164 | 0.6527 |
| 0.323 | 17.0 | 969 | 0.2228 | 0.9166 | 0.6527 |
| 0.2463 | 18.0 | 1026 | 0.2232 | 0.9169 | 0.6586 |
| 0.2463 | 19.0 | 1083 | 0.2243 | 0.9162 | 0.6571 |
| 0.2463 | 20.0 | 1140 | 0.2236 | 0.9147 | 0.6519 |
| 0.2463 | 21.0 | 1197 | 0.2255 | 0.9203 | 0.6637 |
| 0.2463 | 22.0 | 1254 | 0.2261 | 0.9177 | 0.6556 |
| 0.2463 | 23.0 | 1311 | 0.2226 | 0.9169 | 0.6637 |
| 0.2463 | 24.0 | 1368 | 0.2226 | 0.9175 | 0.6718 |
| 0.2463 | 25.0 | 1425 | 0.2246 | 0.9147 | 0.6571 |
| 0.2463 | 26.0 | 1482 | 0.2240 | 0.9147 | 0.6637 |
| 0.2313 | 27.0 | 1539 | 0.2237 | 0.9164 | 0.6622 |
| 0.2313 | 28.0 | 1596 | 0.2235 | 0.9176 | 0.6711 |
| 0.2313 | 29.0 | 1653 | 0.2228 | 0.9154 | 0.6689 |
| 0.2313 | 30.0 | 1710 | 0.2220 | 0.9165 | 0.6748 |
| 0.2313 | 31.0 | 1767 | 0.2231 | 0.9168 | 0.6696 |
| 0.2313 | 32.0 | 1824 | 0.2231 | 0.9176 | 0.6718 |
| 0.2313 | 33.0 | 1881 | 0.2241 | 0.9166 | 0.6704 |
| 0.2313 | 34.0 | 1938 | 0.2224 | 0.9167 | 0.6704 |
| 0.2313 | 35.0 | 1995 | 0.2219 | 0.9168 | 0.6748 |
| 0.2248 | 36.0 | 2052 | 0.2250 | 0.9209 | 0.6829 |
| 0.2248 | 37.0 | 2109 | 0.2235 | 0.9139 | 0.6593 |
| 0.2248 | 38.0 | 2166 | 0.2226 | 0.9146 | 0.6659 |
| 0.2248 | 39.0 | 2223 | 0.2228 | 0.9176 | 0.6748 |
| 0.2248 | 40.0 | 2280 | 0.2227 | 0.9152 | 0.6637 |
| 0.2248 | 41.0 | 2337 | 0.2225 | 0.9151 | 0.6652 |
| 0.2248 | 42.0 | 2394 | 0.2224 | 0.9134 | 0.6593 |
| 0.2248 | 43.0 | 2451 | 0.2225 | 0.9175 | 0.6748 |
| 0.2201 | 44.0 | 2508 | 0.2219 | 0.9158 | 0.6637 |
| 0.2201 | 45.0 | 2565 | 0.2222 | 0.9151 | 0.6659 |
| 0.2201 | 46.0 | 2622 | 0.2211 | 0.9160 | 0.6681 |
| 0.2201 | 47.0 | 2679 | 0.2214 | 0.9167 | 0.6696 |
| 0.2201 | 48.0 | 2736 | 0.2218 | 0.9163 | 0.6681 |
| 0.2201 | 49.0 | 2793 | 0.2217 | 0.9146 | 0.6615 |
| 0.2201 | 50.0 | 2850 | 0.2216 | 0.9135 | 0.6593 |
| 0.2201 | 51.0 | 2907 | 0.2217 | 0.9174 | 0.6770 |
| 0.2201 | 52.0 | 2964 | 0.2219 | 0.9166 | 0.6755 |
| 0.2162 | 53.0 | 3021 | 0.2219 | 0.9165 | 0.6748 |
| 0.2162 | 54.0 | 3078 | 0.2214 | 0.9182 | 0.6814 |
| 0.2162 | 55.0 | 3135 | 0.2211 | 0.9165 | 0.6792 |
| 0.2162 | 56.0 | 3192 | 0.2211 | 0.9169 | 0.6748 |
| 0.2162 | 57.0 | 3249 | 0.2211 | 0.9149 | 0.6726 |
| 0.2162 | 58.0 | 3306 | 0.2209 | 0.9167 | 0.6814 |
| 0.2162 | 59.0 | 3363 | 0.2213 | 0.9167 | 0.6726 |
| 0.2162 | 60.0 | 3420 | 0.2215 | 0.9158 | 0.6726 |
| 0.2162 | 61.0 | 3477 | 0.2211 | 0.9150 | 0.6681 |
| 0.2157 | 62.0 | 3534 | 0.2211 | 0.9166 | 0.6748 |
| 0.2157 | 63.0 | 3591 | 0.2209 | 0.9157 | 0.6770 |
| 0.2157 | 64.0 | 3648 | 0.2209 | 0.9165 | 0.6748 |
| 0.2157 | 65.0 | 3705 | 0.2209 | 0.9157 | 0.6726 |
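
The pattern in the Training Loss column ("No log" for eight epochs, then a value that changes only every nine epochs or so) is what one would expect if the training loss were logged every 500 optimizer steps while evaluation ran once per 57-step epoch. The sketch below checks that reading and also bounds the training set size implied by 57 steps per epoch. The logging interval of 500 steps, single-device training, no gradient accumulation, and a kept final partial batch are all assumptions not stated in the card; the helper names are hypothetical.

```python
STEPS_PER_EPOCH = 57   # from the table: step 57 at epoch 1.0
NUM_EPOCHS = 65        # num_epochs from the card
BATCH_SIZE = 32        # train_batch_size from the card
LOGGING_STEPS = 500    # assumption: training loss logged every 500 steps


def first_epoch_showing_log(k: int) -> int:
    """First epoch whose final step is at or past the k-th logging step."""
    # Integer ceiling of (k * LOGGING_STEPS) / STEPS_PER_EPOCH.
    return -(-(k * LOGGING_STEPS) // STEPS_PER_EPOCH)


def train_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple:
    """Smallest and largest example counts that yield this many batches per epoch."""
    return ((steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size)


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
num_logs = total_steps // LOGGING_STEPS
print(total_steps)                                                  # 3705
print([first_epoch_showing_log(k) for k in range(1, num_logs + 1)])
# [9, 18, 27, 36, 44, 53, 62]
print(train_size_bounds(STEPS_PER_EPOCH, BATCH_SIZE))               # (1793, 1824)
```

The computed epochs coincide with the rows where a new training loss value first appears (0.323, 0.2463, 0.2313, 0.2248, 0.2201, 0.2162, 0.2157), which is consistent with the assumed interval but does not prove it.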

Framework versions

  • Transformers 4.41.1
  • PyTorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1