End of training

Files changed (7)

README.md +78 -35
config.json +32 -22
model.safetensors +2 -2
special_tokens_map.json +42 -6
tokenizer.json +4 -4
tokenizer_config.json +6 -6
training_args.bin +1 -1

README.md CHANGED

@@ -1,11 +1,10 @@
 ---
-license: apache-2.0
-base_model: allenai/longformer-base-4096
 tags:
 - generated_from_trainer
 metrics:
 - f1
-- accuracy
 model-index:
 - name: longformer-base-4096-airlines-news-multi-label
   results: []
@@ -16,12 +15,12 @@ should probably proofread and complete it, then remove this comment. -->
 # longformer-base-4096-airlines-news-multi-label
-This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.4263
-- F1: 0.7073
-- Roc Auc: 0.8173
-- Accuracy: 0.6638
 ## Model description
@@ -40,39 +39,83 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 7e-05
-- train_batch_size: 16
-- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 150
-- num_epochs: 20
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | F1     | Roc Auc | Accuracy |
-|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|:--------:|
-| No log        | 1.0   | 118  | 0.3445          | 0.3084 | 0.5914  | 0.5660   |
-| No log        | 2.0   | 236  | 0.2649          | 0.6263 | 0.7544  | 0.6255   |
-| No log        | 3.0   | 354  | 0.2985          | 0.5344 | 0.6874  | 0.6298   |
-| No log        | 4.0   | 472  | 0.2630          | 0.6604 | 0.7867  | 0.6511   |
-| 0.248         | 5.0   | 590  | 0.2887          | 0.6578 | 0.7728  | 0.6511   |
-| 0.248         | 6.0   | 708  | 0.3088          | 0.6515 | 0.7733  | 0.6511   |
-| 0.248         | 7.0   | 826  | 0.3399          | 0.6367 | 0.7679  | 0.6213   |
-| 0.248         | 8.0   | 944  | 0.3477          | 0.6537 | 0.7757  | 0.6383   |
-| 0.0706        | 9.0   | 1062 | 0.3540          | 0.6749 | 0.7959  | 0.6468   |
-| 0.0706        | 10.0  | 1180 | 0.3847          | 0.6649 | 0.8183  | 0.5702   |
-| 0.0706        | 11.0  | 1298 | 0.4104          | 0.6742 | 0.8150  | 0.6043   |
-| 0.0706        | 12.0  | 1416 | 0.3894          | 0.7006 | 0.8177  | 0.6468   |
-| 0.0212        | 13.0  | 1534 | 0.4363          | 0.6706 | 0.8026  | 0.6255   |
-| 0.0212        | 14.0  | 1652 | 0.4135          | 0.6954 | 0.8085  | 0.6638   |
-| 0.0212        | 15.0  | 1770 | 0.4263          | 0.6822 | 0.8132  | 0.6213   |
-| 0.0212        | 16.0  | 1888 | 0.4162          | 0.6972 | 0.8110  | 0.6553   |
-| 0.0057        | 17.0  | 2006 | 0.4319          | 0.6985 | 0.8172  | 0.6468   |
-| 0.0057        | 18.0  | 2124 | 0.4263          | 0.7073 | 0.8173  | 0.6638   |
-| 0.0057        | 19.0  | 2242 | 0.4308          | 0.6988 | 0.8153  | 0.6468   |
-| 0.0057        | 20.0  | 2360 | 0.4288          | 0.7030 | 0.8163  | 0.6553   |
 ### Framework versions

 ---
+license: cc-by-sa-4.0
+base_model: kiddothe2b/longformer-base-4096
 tags:
 - generated_from_trainer
 metrics:
 - f1
 model-index:
 - name: longformer-base-4096-airlines-news-multi-label
   results: []
 # longformer-base-4096-airlines-news-multi-label
+This model is a fine-tuned version of [kiddothe2b/longformer-base-4096](https://huggingface.co/kiddothe2b/longformer-base-4096) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.2583
+- F1: 0.8916
+- Roc Auc: 0.6172
+- Hamming: 0.8950
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 9e-05
+- train_batch_size: 32
+- eval_batch_size: 32
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 65
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | F1     | Roc Auc | Hamming |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|:-------:|
+| No log        | 1.0   | 57   | 0.3454          | 0.8319 | 0.5     | 0.8850  |
+| No log        | 2.0   | 114  | 0.3372          | 0.8319 | 0.5     | 0.8850  |
+| No log        | 3.0   | 171  | 0.3353          | 0.8319 | 0.5     | 0.8850  |
+| No log        | 4.0   | 228  | 0.3310          | 0.8319 | 0.5     | 0.8850  |
+| No log        | 5.0   | 285  | 0.3278          | 0.8319 | 0.5     | 0.8850  |
+| No log        | 6.0   | 342  | 0.3242          | 0.8319 | 0.5     | 0.8850  |
+| No log        | 7.0   | 399  | 0.3206          | 0.8319 | 0.5     | 0.8850  |
+| No log        | 8.0   | 456  | 0.3168          | 0.8319 | 0.5     | 0.8850  |
+| 0.3599        | 9.0   | 513  | 0.3120          | 0.8319 | 0.5     | 0.8850  |
+| 0.3599        | 10.0  | 570  | 0.3089          | 0.8319 | 0.5     | 0.8850  |
+| 0.3599        | 11.0  | 627  | 0.3039          | 0.8319 | 0.5     | 0.8850  |
+| 0.3599        | 12.0  | 684  | 0.3000          | 0.8319 | 0.5     | 0.8850  |
+| 0.3599        | 13.0  | 741  | 0.2969          | 0.8319 | 0.5     | 0.8850  |
+| 0.3599        | 14.0  | 798  | 0.2932          | 0.8319 | 0.5     | 0.8850  |
+| 0.3599        | 15.0  | 855  | 0.2893          | 0.8449 | 0.5064  | 0.8864  |
+| 0.3599        | 16.0  | 912  | 0.2859          | 0.8449 | 0.5064  | 0.8864  |
+| 0.3599        | 17.0  | 969  | 0.2824          | 0.8449 | 0.5064  | 0.8864  |
+| 0.3111        | 18.0  | 1026 | 0.2800          | 0.8613 | 0.5192  | 0.8894  |
+| 0.3111        | 19.0  | 1083 | 0.2773          | 0.8606 | 0.5160  | 0.8886  |
+| 0.3111        | 20.0  | 1140 | 0.2752          | 0.8586 | 0.5248  | 0.8894  |
+| 0.3111        | 21.0  | 1197 | 0.2727          | 0.8586 | 0.5248  | 0.8894  |
+| 0.3111        | 22.0  | 1254 | 0.2703          | 0.8597 | 0.5280  | 0.8901  |
+| 0.3111        | 23.0  | 1311 | 0.2679          | 0.8761 | 0.5532  | 0.8953  |
+| 0.3111        | 24.0  | 1368 | 0.2665          | 0.8783 | 0.5684  | 0.8975  |
+| 0.3111        | 25.0  | 1425 | 0.2645          | 0.8791 | 0.5688  | 0.8982  |
+| 0.3111        | 26.0  | 1482 | 0.2627          | 0.8789 | 0.5776  | 0.8990  |
+| 0.2854        | 27.0  | 1539 | 0.2611          | 0.8780 | 0.5716  | 0.8982  |
+| 0.2854        | 28.0  | 1596 | 0.2597          | 0.8791 | 0.5688  | 0.8982  |
+| 0.2854        | 29.0  | 1653 | 0.2584          | 0.8818 | 0.5845  | 0.9012  |
+| 0.2854        | 30.0  | 1710 | 0.2570          | 0.8825 | 0.5877  | 0.9019  |
+| 0.2854        | 31.0  | 1767 | 0.2564          | 0.8930 | 0.6405  | 0.9115  |
+| 0.2854        | 32.0  | 1824 | 0.2556          | 0.8913 | 0.6396  | 0.9100  |
+| 0.2854        | 33.0  | 1881 | 0.2547          | 0.8870 | 0.6296  | 0.9071  |
+| 0.2854        | 34.0  | 1938 | 0.2531          | 0.8843 | 0.6029  | 0.9041  |
+| 0.2854        | 35.0  | 1995 | 0.2522          | 0.8912 | 0.6341  | 0.9100  |
+| 0.2722        | 36.0  | 2052 | 0.2516          | 0.8914 | 0.6341  | 0.9100  |
+| 0.2722        | 37.0  | 2109 | 0.2507          | 0.8913 | 0.6369  | 0.9100  |
+| 0.2722        | 38.0  | 2166 | 0.2501          | 0.8899 | 0.6392  | 0.9093  |
+| 0.2722        | 39.0  | 2223 | 0.2491          | 0.8865 | 0.6264  | 0.9063  |
+| 0.2722        | 40.0  | 2280 | 0.2486          | 0.8939 | 0.6409  | 0.9122  |
+| 0.2722        | 41.0  | 2337 | 0.2483          | 0.8921 | 0.6516  | 0.9115  |
+| 0.2722        | 42.0  | 2394 | 0.2474          | 0.8913 | 0.6512  | 0.9108  |
+| 0.2722        | 43.0  | 2451 | 0.2466          | 0.8911 | 0.6341  | 0.9100  |
+| 0.2652        | 44.0  | 2508 | 0.2461          | 0.8950 | 0.6557  | 0.9137  |
+| 0.2652        | 45.0  | 2565 | 0.2459          | 0.8913 | 0.6540  | 0.9108  |
+| 0.2652        | 46.0  | 2622 | 0.2453          | 0.8934 | 0.6521  | 0.9122  |
+| 0.2652        | 47.0  | 2679 | 0.2446          | 0.8950 | 0.6557  | 0.9137  |
+| 0.2652        | 48.0  | 2736 | 0.2445          | 0.8922 | 0.6572  | 0.9115  |
+| 0.2652        | 49.0  | 2793 | 0.2442          | 0.8931 | 0.6521  | 0.9122  |
+| 0.2652        | 50.0  | 2850 | 0.2440          | 0.8938 | 0.6608  | 0.9130  |
+| 0.2652        | 51.0  | 2907 | 0.2436          | 0.8930 | 0.6576  | 0.9122  |
+| 0.2652        | 52.0  | 2964 | 0.2432          | 0.8940 | 0.6553  | 0.9130  |
+| 0.2603        | 53.0  | 3021 | 0.2430          | 0.8940 | 0.6553  | 0.9130  |
+| 0.2603        | 54.0  | 3078 | 0.2428          | 0.8930 | 0.6576  | 0.9122  |
+| 0.2603        | 55.0  | 3135 | 0.2425          | 0.8938 | 0.6608  | 0.9130  |
+| 0.2603        | 56.0  | 3192 | 0.2424          | 0.8904 | 0.6480  | 0.9100  |
+| 0.2603        | 57.0  | 3249 | 0.2424          | 0.8938 | 0.6636  | 0.9130  |
+| 0.2603        | 58.0  | 3306 | 0.2422          | 0.8938 | 0.6636  | 0.9130  |
+| 0.2603        | 59.0  | 3363 | 0.2421          | 0.9070 | 0.6668  | 0.9137  |
+| 0.2603        | 60.0  | 3420 | 0.2419          | 0.9070 | 0.6668  | 0.9137  |
+| 0.2603        | 61.0  | 3477 | 0.2418          | 0.8938 | 0.6636  | 0.9130  |
+| 0.2578        | 62.0  | 3534 | 0.2418          | 0.8938 | 0.6636  | 0.9130  |
+| 0.2578        | 63.0  | 3591 | 0.2416          | 0.8930 | 0.6576  | 0.9122  |
+| 0.2578        | 64.0  | 3648 | 0.2416          | 0.8938 | 0.6608  | 0.9130  |
+| 0.2578        | 65.0  | 3705 | 0.2416          | 0.8930 | 0.6576  | 0.9122  |
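
The Step column in the new table can be sanity-checked with simple arithmetic: 57 optimizer steps per epoch over 65 epochs gives the final step of 3705. The sketch below also back-solves the range of training-set sizes compatible with 57 steps at `train_batch_size: 32`. It assumes a single device, no gradient accumulation, and that the last partial batch is kept, none of which the commit confirms. The helper names `steps_per_epoch` and `example_range` are hypothetical.

```python
import math
from typing import Tuple


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Batches per epoch when the final partial batch is kept (assumption)."""
    return math.ceil(num_examples / batch_size)


def example_range(steps: int, batch_size: int) -> Tuple[int, int]:
    """Smallest and largest dataset sizes that yield exactly `steps` batches."""
    return (steps - 1) * batch_size + 1, steps * batch_size


# Values read from the model card in this commit.
new_steps, new_epochs, new_batch = 57, 65, 32

total_steps = new_steps * new_epochs          # matches the last row of the table
low, high = example_range(new_steps, new_batch)

print(f"total steps: {total_steps}")
print(f"training examples (under the stated assumptions): {low} to {high}")
```

If gradient accumulation or multiple devices were used, the effective batch size would be larger and the implied dataset size would scale accordingly.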
 ### Framework versions

config.json CHANGED

@@ -1,25 +1,27 @@
 {
-  "_name_or_path": "allenai/longformer-base-4096",
   "architectures": [
     "LongformerForSequenceClassification"
   ],
   "attention_mode": "longformer",
   "attention_probs_dropout_prob": 0.1,
   "attention_window": [
-    512,
-    512,
-    512,
-    512,
-    512,
-    512,
-    512,
-    512,
-    512,
-    512,
-    512,
-    512
   ],
   "bos_token_id": 0,
   "eos_token_id": 2,
   "gradient_checkpointing": false,
   "hidden_act": "gelu",
@@ -27,32 +29,40 @@
   "hidden_size": 768,
   "id2label": {
     "0": "capacity expansion",
-    "1": "market expansion",
-    "2": "merger & acquisition and finance investments",
-    "3": "outsourcing and alliance",
-    "4": "product introductions and improvements"
   },
   "ignore_attention_mask": false,
   "initializer_range": 0.02,
   "intermediate_size": 3072,
   "label2id": {
     "capacity expansion": 0,
-    "market expansion": 1,
-    "merger & acquisition and finance investments": 2,
-    "outsourcing and alliance": 3,
-    "product introductions and improvements": 4
   },
   "layer_norm_eps": 1e-05,
-  "max_position_embeddings": 4098,
   "model_type": "longformer",
   "num_attention_heads": 12,
   "num_hidden_layers": 12,
   "onnx_export": false,
   "pad_token_id": 1,
   "problem_type": "multi_label_classification",
   "sep_token_id": 2,
   "torch_dtype": "float32",
   "transformers_version": "4.41.1",
   "type_vocab_size": 1,
   "vocab_size": 50265
 }

 {
+  "_name_or_path": "kiddothe2b/longformer-base-4096",
   "architectures": [
     "LongformerForSequenceClassification"
   ],
   "attention_mode": "longformer",
   "attention_probs_dropout_prob": 0.1,
   "attention_window": [
+    128,
+    128,
+    128,
+    128,
+    128,
+    128,
+    128,
+    128,
+    128,
+    128,
+    128,
+    128
   ],
   "bos_token_id": 0,
+  "classifier_dropout": null,
+  "cls_token_id": 0,
   "eos_token_id": 2,
   "gradient_checkpointing": false,
   "hidden_act": "gelu",
   "hidden_size": 768,
   "id2label": {
     "0": "capacity expansion",
+    "1": "legal action",
+    "2": "market expansion",
+    "3": "merger & acquisition and finance investments",
+    "4": "outsourcing and alliance",
+    "5": "product introductions and improvements"
   },
   "ignore_attention_mask": false,
   "initializer_range": 0.02,
   "intermediate_size": 3072,
   "label2id": {
     "capacity expansion": 0,
+    "legal action": 1,
+    "market expansion": 2,
+    "merger & acquisition and finance investments": 3,
+    "outsourcing and alliance": 4,
+    "product introductions and improvements": 5
   },
   "layer_norm_eps": 1e-05,
+  "max_position_embeddings": 4099,
+  "max_sentence_length": 128,
+  "max_sentence_size": 128,
+  "max_sentences": 8,
+  "model_max_length": 4096,
   "model_type": "longformer",
   "num_attention_heads": 12,
   "num_hidden_layers": 12,
   "onnx_export": false,
   "pad_token_id": 1,
+  "position_embedding_type": "absolute",
   "problem_type": "multi_label_classification",
   "sep_token_id": 2,
   "torch_dtype": "float32",
   "transformers_version": "4.41.1",
   "type_vocab_size": 1,
+  "use_cache": true,
   "vocab_size": 50265
 }
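
Because `problem_type` is `multi_label_classification`, each of the six logits is typically scored independently with a sigmoid rather than a softmax over classes. The following is a minimal sketch of decoding one logit vector into label names using the new `id2label` mapping. The 0.5 threshold, the helper name `decode_multi_label`, and the example logits are assumptions for illustration, not values taken from this commit.

```python
import math
from typing import Dict, List

# Copied from the updated config.json in this commit.
ID2LABEL: Dict[int, str] = {
    0: "capacity expansion",
    1: "legal action",
    2: "market expansion",
    3: "merger & acquisition and finance investments",
    4: "outsourcing and alliance",
    5: "product introductions and improvements",
}


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multi_label(logits: List[float], threshold: float = 0.5) -> List[str]:
    """Return every label whose independent sigmoid score meets the threshold."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    return [ID2LABEL[i] for i, z in enumerate(logits) if sigmoid(z) >= threshold]


# Hypothetical logits for a single document.
example_logits = [2.0, -3.0, 0.1, -0.5, -4.0, 1.2]
predicted = decode_multi_label(example_logits)
print(predicted)
```

Note that inserting "legal action" at index 1 shifts every later label by one relative to the previous config, so predictions from the old and new checkpoints should not be compared by index.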

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:86b1106bd0591ef41dd94025bdf55aeb65194b95e94bbf444326c6eeaf118cf8
-size 594687412

 version https://git-lfs.github.com/spec/v1
+oid sha256:487dae951a50328e3fa75be613532932c105f2c3cc4975df9b5eae770184b481
+size 595481184

special_tokens_map.json CHANGED

@@ -1,7 +1,25 @@
 {
-  "bos_token": "<s>",
-  "cls_token": "<s>",
-  "eos_token": "</s>",
   "mask_token": {
     "content": "<mask>",
     "lstrip": true,
@@ -9,7 +27,25 @@
     "rstrip": false,
     "single_word": false
   },
-  "pad_token": "<pad>",
-  "sep_token": "</s>",
-  "unk_token": "<unk>"
 }

 {
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "cls_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
   "mask_token": {
     "content": "<mask>",
     "lstrip": true,
     "rstrip": false,
     "single_word": false
   },
+  "pad_token": {
+    "content": "<pad>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "sep_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
 }

tokenizer.json CHANGED

@@ -23,7 +23,7 @@
       "single_word": false,
       "lstrip": false,
       "rstrip": false,
-      "normalized": true,
       "special": true
     },
     {
@@ -32,7 +32,7 @@
       "single_word": false,
       "lstrip": false,
       "rstrip": false,
-      "normalized": true,
       "special": true
     },
     {
@@ -41,7 +41,7 @@
       "single_word": false,
       "lstrip": false,
       "rstrip": false,
-      "normalized": true,
       "special": true
     },
     {
@@ -50,7 +50,7 @@
       "single_word": false,
       "lstrip": false,
       "rstrip": false,
-      "normalized": true,
       "special": true
     },
     {

       "single_word": false,
       "lstrip": false,
       "rstrip": false,
+      "normalized": false,
       "special": true
     },
     {
       "single_word": false,
       "lstrip": false,
       "rstrip": false,
+      "normalized": false,
       "special": true
     },
     {
       "single_word": false,
       "lstrip": false,
       "rstrip": false,
+      "normalized": false,
       "special": true
     },
     {
       "single_word": false,
       "lstrip": false,
       "rstrip": false,
+      "normalized": false,
       "special": true
     },
     {

tokenizer_config.json CHANGED

@@ -4,7 +4,7 @@
     "0": {
       "content": "<s>",
       "lstrip": false,
-      "normalized": true,
       "rstrip": false,
       "single_word": false,
       "special": true
@@ -12,7 +12,7 @@
     "1": {
       "content": "<pad>",
       "lstrip": false,
-      "normalized": true,
       "rstrip": false,
       "single_word": false,
       "special": true
@@ -20,7 +20,7 @@
     "2": {
       "content": "</s>",
       "lstrip": false,
-      "normalized": true,
       "rstrip": false,
       "single_word": false,
       "special": true
@@ -28,7 +28,7 @@
     "3": {
       "content": "<unk>",
       "lstrip": false,
-      "normalized": true,
       "rstrip": false,
       "single_word": false,
       "special": true
@@ -48,10 +48,10 @@
   "eos_token": "</s>",
   "errors": "replace",
   "mask_token": "<mask>",
-  "model_max_length": 1000000000000000019884624838656,
   "pad_token": "<pad>",
   "sep_token": "</s>",
-  "tokenizer_class": "LongformerTokenizer",
   "trim_offsets": true,
   "unk_token": "<unk>"
 }

     "0": {
       "content": "<s>",
       "lstrip": false,
+      "normalized": false,
       "rstrip": false,
       "single_word": false,
       "special": true
     "1": {
       "content": "<pad>",
       "lstrip": false,
+      "normalized": false,
       "rstrip": false,
       "single_word": false,
       "special": true
     "2": {
       "content": "</s>",
       "lstrip": false,
+      "normalized": false,
       "rstrip": false,
       "single_word": false,
       "special": true
     "3": {
       "content": "<unk>",
       "lstrip": false,
+      "normalized": false,
       "rstrip": false,
       "single_word": false,
       "special": true
   "eos_token": "</s>",
   "errors": "replace",
   "mask_token": "<mask>",
+  "model_max_length": 512,
   "pad_token": "<pad>",
   "sep_token": "</s>",
+  "tokenizer_class": "RobertaTokenizer",
   "trim_offsets": true,
   "unk_token": "<unk>"
 }

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2c1893cb495a54e4ccb189d566b9dcfe0182b2da3067bd1c7d6ecd02a4422e8f
 size 5176

 version https://git-lfs.github.com/spec/v1
+oid sha256:d9a3e90efe4505fb6b5cb17cf26f0e62a7d12ef8a97a6fdb6ae37d03e94a5c8c
 size 5176