2023-10-17 16:47:16,055 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,056 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 MultiCorpus: 14465 train + 1392 dev + 2432 test sentences
- NER_HIPE_2022 Corpus: 14465 train + 1392 dev + 2432 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/letemps/fr/with_doc_seperator
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 Train: 14465 sentences
2023-10-17 16:47:16,057 (train_with_dev=False, train_with_test=False)
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 Training Params:
2023-10-17 16:47:16,057 - learning_rate: "3e-05"
2023-10-17 16:47:16,057 - mini_batch_size: "8"
2023-10-17 16:47:16,057 - max_epochs: "10"
2023-10-17 16:47:16,057 - shuffle: "True"
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 Plugins:
2023-10-17 16:47:16,057 - TensorboardLogger
2023-10-17 16:47:16,057 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:47:16,057 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:47:16,057 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,057 Computation:
2023-10-17 16:47:16,057 - compute on device: cuda:0
2023-10-17 16:47:16,058 - embedding storage: none
2023-10-17 16:47:16,058 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,058 Model training base path: "hmbench-letemps/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 16:47:16,058 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,058 ----------------------------------------------------------------------------------------------------
2023-10-17 16:47:16,058 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 16:47:28,647 epoch 1 - iter 180/1809 - loss 2.37829275 - time (sec): 12.59 - samples/sec: 2902.52 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:47:41,603 epoch 1 - iter 360/1809 - loss 1.29142331 - time (sec): 25.54 - samples/sec: 2959.16 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:47:54,374 epoch 1 - iter 540/1809 - loss 0.91673625 - time (sec): 38.31 - samples/sec: 2962.10 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:48:07,489 epoch 1 - iter 720/1809 - loss 0.72239703 - time (sec): 51.43 - samples/sec: 2963.13 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:48:20,496 epoch 1 - iter 900/1809 - loss 0.60634230 - time (sec): 64.44 - samples/sec: 2936.89 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:48:33,381 epoch 1 - iter 1080/1809 - loss 0.52686295 - time (sec): 77.32 - samples/sec: 2940.72 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:48:46,313 epoch 1 - iter 1260/1809 - loss 0.46693740 - time (sec): 90.25 - samples/sec: 2944.10 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:48:59,460 epoch 1 - iter 1440/1809 - loss 0.42178664 - time (sec): 103.40 - samples/sec: 2949.31 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:49:12,569 epoch 1 - iter 1620/1809 - loss 0.38783448 - time (sec): 116.51 - samples/sec: 2935.62 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:49:25,738 epoch 1 - iter 1800/1809 - loss 0.36040896 - time (sec): 129.68 - samples/sec: 2918.82 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:49:26,324 ----------------------------------------------------------------------------------------------------
2023-10-17 16:49:26,324 EPOCH 1 done: loss 0.3594 - lr: 0.000030
2023-10-17 16:49:31,785 DEV : loss 0.10473097860813141 - f1-score (micro avg) 0.6133
2023-10-17 16:49:31,826 saving best model
2023-10-17 16:49:32,320 ----------------------------------------------------------------------------------------------------
2023-10-17 16:49:45,341 epoch 2 - iter 180/1809 - loss 0.09588522 - time (sec): 13.02 - samples/sec: 2975.94 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:49:58,243 epoch 2 - iter 360/1809 - loss 0.08882249 - time (sec): 25.92 - samples/sec: 2950.13 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:50:11,235 epoch 2 - iter 540/1809 - loss 0.08363106 - time (sec): 38.91 - samples/sec: 2948.32 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:50:23,803 epoch 2 - iter 720/1809 - loss 0.08455393 - time (sec): 51.48 - samples/sec: 2937.41 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:50:37,115 epoch 2 - iter 900/1809 - loss 0.08508426 - time (sec): 64.79 - samples/sec: 2906.26 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:50:49,757 epoch 2 - iter 1080/1809 - loss 0.08610959 - time (sec): 77.44 - samples/sec: 2898.77 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:51:03,116 epoch 2 - iter 1260/1809 - loss 0.08766651 - time (sec): 90.80 - samples/sec: 2888.85 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:51:16,240 epoch 2 - iter 1440/1809 - loss 0.08705399 - time (sec): 103.92 - samples/sec: 2899.26 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:51:29,593 epoch 2 - iter 1620/1809 - loss 0.08720869 - time (sec): 117.27 - samples/sec: 2886.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:51:42,832 epoch 2 - iter 1800/1809 - loss 0.08686349 - time (sec): 130.51 - samples/sec: 2895.75 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:51:43,549 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:43,550 EPOCH 2 done: loss 0.0868 - lr: 0.000027
2023-10-17 16:51:50,679 DEV : loss 0.09997577220201492 - f1-score (micro avg) 0.6206
2023-10-17 16:51:50,720 saving best model
2023-10-17 16:51:51,301 ----------------------------------------------------------------------------------------------------
2023-10-17 16:52:04,562 epoch 3 - iter 180/1809 - loss 0.06057686 - time (sec): 13.26 - samples/sec: 2882.78 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:52:17,739 epoch 3 - iter 360/1809 - loss 0.05898876 - time (sec): 26.44 - samples/sec: 2894.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:52:30,634 epoch 3 - iter 540/1809 - loss 0.05667154 - time (sec): 39.33 - samples/sec: 2893.24 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:52:43,874 epoch 3 - iter 720/1809 - loss 0.05829348 - time (sec): 52.57 - samples/sec: 2883.78 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:52:56,734 epoch 3 - iter 900/1809 - loss 0.05931267 - time (sec): 65.43 - samples/sec: 2887.88 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:53:09,638 epoch 3 - iter 1080/1809 - loss 0.05999951 - time (sec): 78.34 - samples/sec: 2893.92 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:53:23,322 epoch 3 - iter 1260/1809 - loss 0.05998945 - time (sec): 92.02 - samples/sec: 2880.58 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:53:37,421 epoch 3 - iter 1440/1809 - loss 0.06121733 - time (sec): 106.12 - samples/sec: 2846.19 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:53:51,481 epoch 3 - iter 1620/1809 - loss 0.06250987 - time (sec): 120.18 - samples/sec: 2822.20 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:54:05,576 epoch 3 - iter 1800/1809 - loss 0.06260649 - time (sec): 134.27 - samples/sec: 2818.01 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:54:06,170 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:06,171 EPOCH 3 done: loss 0.0627 - lr: 0.000023
2023-10-17 16:54:12,478 DEV : loss 0.13520988821983337 - f1-score (micro avg) 0.6323
2023-10-17 16:54:12,520 saving best model
2023-10-17 16:54:13,120 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:26,854 epoch 4 - iter 180/1809 - loss 0.03589391 - time (sec): 13.73 - samples/sec: 2725.59 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:54:41,062 epoch 4 - iter 360/1809 - loss 0.04243949 - time (sec): 27.94 - samples/sec: 2704.29 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:54:55,361 epoch 4 - iter 540/1809 - loss 0.04312213 - time (sec): 42.24 - samples/sec: 2711.73 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:55:08,899 epoch 4 - iter 720/1809 - loss 0.04407922 - time (sec): 55.78 - samples/sec: 2715.57 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:55:21,317 epoch 4 - iter 900/1809 - loss 0.04309184 - time (sec): 68.20 - samples/sec: 2751.30 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:55:35,293 epoch 4 - iter 1080/1809 - loss 0.04461675 - time (sec): 82.17 - samples/sec: 2765.77 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:55:47,893 epoch 4 - iter 1260/1809 - loss 0.04482022 - time (sec): 94.77 - samples/sec: 2791.18 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:56:00,644 epoch 4 - iter 1440/1809 - loss 0.04402527 - time (sec): 107.52 - samples/sec: 2805.31 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:56:13,843 epoch 4 - iter 1620/1809 - loss 0.04508043 - time (sec): 120.72 - samples/sec: 2821.11 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:56:26,940 epoch 4 - iter 1800/1809 - loss 0.04577307 - time (sec): 133.82 - samples/sec: 2826.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:56:27,553 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:27,554 EPOCH 4 done: loss 0.0458 - lr: 0.000020
2023-10-17 16:56:33,892 DEV : loss 0.19658434391021729 - f1-score (micro avg) 0.65
2023-10-17 16:56:33,932 saving best model
2023-10-17 16:56:34,506 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:47,272 epoch 5 - iter 180/1809 - loss 0.02885208 - time (sec): 12.76 - samples/sec: 2976.47 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:57:00,161 epoch 5 - iter 360/1809 - loss 0.03100421 - time (sec): 25.65 - samples/sec: 2946.51 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:13,078 epoch 5 - iter 540/1809 - loss 0.03368269 - time (sec): 38.57 - samples/sec: 2921.39 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:25,741 epoch 5 - iter 720/1809 - loss 0.03096940 - time (sec): 51.23 - samples/sec: 2936.84 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:38,316 epoch 5 - iter 900/1809 - loss 0.03114871 - time (sec): 63.81 - samples/sec: 2937.46 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:57:50,872 epoch 5 - iter 1080/1809 - loss 0.03098647 - time (sec): 76.36 - samples/sec: 2935.27 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:58:04,501 epoch 5 - iter 1260/1809 - loss 0.03174442 - time (sec): 89.99 - samples/sec: 2923.36 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:58:18,452 epoch 5 - iter 1440/1809 - loss 0.03137972 - time (sec): 103.94 - samples/sec: 2896.31 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:58:32,933 epoch 5 - iter 1620/1809 - loss 0.03319045 - time (sec): 118.42 - samples/sec: 2866.30 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:58:47,625 epoch 5 - iter 1800/1809 - loss 0.03251336 - time (sec): 133.12 - samples/sec: 2841.97 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:58:48,292 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:48,292 EPOCH 5 done: loss 0.0324 - lr: 0.000017
2023-10-17 16:58:55,306 DEV : loss 0.2977985441684723 - f1-score (micro avg) 0.6377
2023-10-17 16:58:55,351 ----------------------------------------------------------------------------------------------------
2023-10-17 16:59:09,632 epoch 6 - iter 180/1809 - loss 0.02528166 - time (sec): 14.28 - samples/sec: 2659.02 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:59:22,461 epoch 6 - iter 360/1809 - loss 0.02336089 - time (sec): 27.11 - samples/sec: 2746.08 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:59:36,882 epoch 6 - iter 540/1809 - loss 0.02596319 - time (sec): 41.53 - samples/sec: 2699.54 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:59:51,458 epoch 6 - iter 720/1809 - loss 0.02376158 - time (sec): 56.11 - samples/sec: 2706.80 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:00:04,585 epoch 6 - iter 900/1809 - loss 0.02282026 - time (sec): 69.23 - samples/sec: 2735.97 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:00:18,532 epoch 6 - iter 1080/1809 - loss 0.02378981 - time (sec): 83.18 - samples/sec: 2736.20 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:00:32,532 epoch 6 - iter 1260/1809 - loss 0.02488103 - time (sec): 97.18 - samples/sec: 2705.92 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:00:47,158 epoch 6 - iter 1440/1809 - loss 0.02406285 - time (sec): 111.81 - samples/sec: 2693.01 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:01:00,454 epoch 6 - iter 1620/1809 - loss 0.02458081 - time (sec): 125.10 - samples/sec: 2712.95 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:01:14,994 epoch 6 - iter 1800/1809 - loss 0.02416930 - time (sec): 139.64 - samples/sec: 2709.21 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:01:15,660 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:15,661 EPOCH 6 done: loss 0.0241 - lr: 0.000013
2023-10-17 17:01:22,099 DEV : loss 0.2875325679779053 - f1-score (micro avg) 0.6463
2023-10-17 17:01:22,141 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:36,174 epoch 7 - iter 180/1809 - loss 0.01469512 - time (sec): 14.03 - samples/sec: 2577.49 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:01:50,551 epoch 7 - iter 360/1809 - loss 0.01466126 - time (sec): 28.41 - samples/sec: 2562.51 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:02:04,459 epoch 7 - iter 540/1809 - loss 0.01507780 - time (sec): 42.32 - samples/sec: 2582.29 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:02:18,493 epoch 7 - iter 720/1809 - loss 0.01484288 - time (sec): 56.35 - samples/sec: 2631.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:02:32,995 epoch 7 - iter 900/1809 - loss 0.01551643 - time (sec): 70.85 - samples/sec: 2653.49 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:02:47,523 epoch 7 - iter 1080/1809 - loss 0.01579238 - time (sec): 85.38 - samples/sec: 2661.61 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:03:01,558 epoch 7 - iter 1260/1809 - loss 0.01541974 - time (sec): 99.42 - samples/sec: 2659.01 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:03:15,932 epoch 7 - iter 1440/1809 - loss 0.01565492 - time (sec): 113.79 - samples/sec: 2650.76 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:03:30,407 epoch 7 - iter 1620/1809 - loss 0.01578131 - time (sec): 128.26 - samples/sec: 2646.94 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:03:45,544 epoch 7 - iter 1800/1809 - loss 0.01577763 - time (sec): 143.40 - samples/sec: 2635.32 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:03:46,233 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:46,234 EPOCH 7 done: loss 0.0157 - lr: 0.000010
2023-10-17 17:03:52,481 DEV : loss 0.3317669928073883 - f1-score (micro avg) 0.6604
2023-10-17 17:03:52,524 saving best model
2023-10-17 17:03:53,119 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:06,717 epoch 8 - iter 180/1809 - loss 0.01083201 - time (sec): 13.60 - samples/sec: 2733.81 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:04:20,695 epoch 8 - iter 360/1809 - loss 0.01085551 - time (sec): 27.57 - samples/sec: 2672.29 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:04:34,811 epoch 8 - iter 540/1809 - loss 0.01160109 - time (sec): 41.69 - samples/sec: 2662.51 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:04:48,963 epoch 8 - iter 720/1809 - loss 0.01150271 - time (sec): 55.84 - samples/sec: 2688.28 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:05:02,029 epoch 8 - iter 900/1809 - loss 0.01158055 - time (sec): 68.91 - samples/sec: 2729.09 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:05:16,268 epoch 8 - iter 1080/1809 - loss 0.01114901 - time (sec): 83.15 - samples/sec: 2709.66 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:05:29,595 epoch 8 - iter 1260/1809 - loss 0.01121147 - time (sec): 96.47 - samples/sec: 2733.10 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:05:43,752 epoch 8 - iter 1440/1809 - loss 0.01133313 - time (sec): 110.63 - samples/sec: 2731.90 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:05:56,559 epoch 8 - iter 1620/1809 - loss 0.01079204 - time (sec): 123.44 - samples/sec: 2744.91 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:06:09,770 epoch 8 - iter 1800/1809 - loss 0.01055259 - time (sec): 136.65 - samples/sec: 2764.47 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:06:10,429 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:10,429 EPOCH 8 done: loss 0.0106 - lr: 0.000007
2023-10-17 17:06:17,407 DEV : loss 0.36872246861457825 - f1-score (micro avg) 0.6586
2023-10-17 17:06:17,455 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:30,029 epoch 9 - iter 180/1809 - loss 0.00589952 - time (sec): 12.57 - samples/sec: 2889.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:06:42,937 epoch 9 - iter 360/1809 - loss 0.00549166 - time (sec): 25.48 - samples/sec: 2899.28 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:06:55,861 epoch 9 - iter 540/1809 - loss 0.00605800 - time (sec): 38.40 - samples/sec: 2916.24 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:07:09,026 epoch 9 - iter 720/1809 - loss 0.00638723 - time (sec): 51.57 - samples/sec: 2919.17 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:07:21,620 epoch 9 - iter 900/1809 - loss 0.00618261 - time (sec): 64.16 - samples/sec: 2935.84 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:07:34,600 epoch 9 - iter 1080/1809 - loss 0.00676500 - time (sec): 77.14 - samples/sec: 2934.03 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:07:48,046 epoch 9 - iter 1260/1809 - loss 0.00688842 - time (sec): 90.59 - samples/sec: 2939.28 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:08:00,908 epoch 9 - iter 1440/1809 - loss 0.00706836 - time (sec): 103.45 - samples/sec: 2943.34 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:08:13,738 epoch 9 - iter 1620/1809 - loss 0.00696957 - time (sec): 116.28 - samples/sec: 2937.59 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:08:26,475 epoch 9 - iter 1800/1809 - loss 0.00698359 - time (sec): 129.02 - samples/sec: 2933.27 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:08:27,089 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:27,090 EPOCH 9 done: loss 0.0070 - lr: 0.000003
2023-10-17 17:08:33,338 DEV : loss 0.38837048411369324 - f1-score (micro avg) 0.6705
2023-10-17 17:08:33,381 saving best model
2023-10-17 17:08:34,006 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:47,073 epoch 10 - iter 180/1809 - loss 0.00224797 - time (sec): 13.07 - samples/sec: 2881.57 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:09:01,239 epoch 10 - iter 360/1809 - loss 0.00362755 - time (sec): 27.23 - samples/sec: 2854.41 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:09:14,204 epoch 10 - iter 540/1809 - loss 0.00370153 - time (sec): 40.20 - samples/sec: 2863.08 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:09:26,807 epoch 10 - iter 720/1809 - loss 0.00450308 - time (sec): 52.80 - samples/sec: 2898.14 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:09:39,229 epoch 10 - iter 900/1809 - loss 0.00476154 - time (sec): 65.22 - samples/sec: 2896.81 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:09:52,036 epoch 10 - iter 1080/1809 - loss 0.00513412 - time (sec): 78.03 - samples/sec: 2917.16 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:10:04,882 epoch 10 - iter 1260/1809 - loss 0.00519172 - time (sec): 90.87 - samples/sec: 2926.77 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:10:17,583 epoch 10 - iter 1440/1809 - loss 0.00512814 - time (sec): 103.58 - samples/sec: 2929.95 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:10:30,625 epoch 10 - iter 1620/1809 - loss 0.00520500 - time (sec): 116.62 - samples/sec: 2936.59 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:10:43,601 epoch 10 - iter 1800/1809 - loss 0.00503373 - time (sec): 129.59 - samples/sec: 2921.20 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:10:44,249 ----------------------------------------------------------------------------------------------------
2023-10-17 17:10:44,249 EPOCH 10 done: loss 0.0050 - lr: 0.000000
2023-10-17 17:10:51,342 DEV : loss 0.397739440202713 - f1-score (micro avg) 0.6676
2023-10-17 17:10:51,878 ----------------------------------------------------------------------------------------------------
2023-10-17 17:10:51,880 Loading model from best epoch ...
2023-10-17 17:10:53,630 SequenceTagger predicts: Dictionary with 13 tags: O, S-loc, B-loc, E-loc, I-loc, S-pers, B-pers, E-pers, I-pers, S-org, B-org, E-org, I-org
2023-10-17 17:11:01,735
Results:
- F-score (micro) 0.6596
- F-score (macro) 0.5192
- Accuracy 0.5023
By class:
              precision    recall  f1-score   support

         loc     0.6533    0.7970    0.7180       591
        pers     0.5792    0.7479    0.6528       357
         org     0.1972    0.1772    0.1867        79

   micro avg     0.6002    0.7322    0.6596      1027
   macro avg     0.4765    0.5740    0.5192      1027
weighted avg     0.5924    0.7322    0.6545      1027
2023-10-17 17:11:01,735 ----------------------------------------------------------------------------------------------------