2023-10-17 15:09:20,533 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,535 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 15:09:20,535 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,535 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 15:09:20,535 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,535 Train: 7142 sentences
2023-10-17 15:09:20,535 (train_with_dev=False, train_with_test=False)
2023-10-17 15:09:20,535 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,535 Training Params:
2023-10-17 15:09:20,535 - learning_rate: "3e-05"
2023-10-17 15:09:20,535 - mini_batch_size: "8"
2023-10-17 15:09:20,535 - max_epochs: "10"
2023-10-17 15:09:20,535 - shuffle: "True"
2023-10-17 15:09:20,535 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,536 Plugins:
2023-10-17 15:09:20,536 - TensorboardLogger
2023-10-17 15:09:20,536 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 15:09:20,536 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,536 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 15:09:20,536 - metric: "('micro avg', 'f1-score')"
2023-10-17 15:09:20,536 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,536 Computation:
2023-10-17 15:09:20,536 - compute on device: cuda:0
2023-10-17 15:09:20,536 - embedding storage: none
2023-10-17 15:09:20,536 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,536 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 15:09:20,536 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,536 ----------------------------------------------------------------------------------------------------
2023-10-17 15:09:20,536 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 15:09:27,546 epoch 1 - iter 89/893 - loss 3.17468277 - time (sec): 7.01 - samples/sec: 3610.80 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:09:34,586 epoch 1 - iter 178/893 - loss 2.10197739 - time (sec): 14.05 - samples/sec: 3591.62 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:09:41,428 epoch 1 - iter 267/893 - loss 1.58385290 - time (sec): 20.89 - samples/sec: 3551.93 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:09:48,238 epoch 1 - iter 356/893 - loss 1.29295235 - time (sec): 27.70 - samples/sec: 3536.72 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:09:54,809 epoch 1 - iter 445/893 - loss 1.10083788 - time (sec): 34.27 - samples/sec: 3532.97 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:10:01,485 epoch 1 - iter 534/893 - loss 0.95582211 - time (sec): 40.95 - samples/sec: 3558.63 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:10:09,028 epoch 1 - iter 623/893 - loss 0.84747117 - time (sec): 48.49 - samples/sec: 3514.68 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:10:16,180 epoch 1 - iter 712/893 - loss 0.75381137 - time (sec): 55.64 - samples/sec: 3538.81 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:10:23,433 epoch 1 - iter 801/893 - loss 0.68340637 - time (sec): 62.90 - samples/sec: 3547.95 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:10:30,490 epoch 1 - iter 890/893 - loss 0.63034171 - time (sec): 69.95 - samples/sec: 3548.03 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:10:30,661 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:30,661 EPOCH 1 done: loss 0.6293 - lr: 0.000030
2023-10-17 15:10:33,451 DEV : loss 0.11842236667871475 - f1-score (micro avg) 0.7256
2023-10-17 15:10:33,469 saving best model
2023-10-17 15:10:33,816 ----------------------------------------------------------------------------------------------------
2023-10-17 15:10:41,206 epoch 2 - iter 89/893 - loss 0.12232714 - time (sec): 7.39 - samples/sec: 3748.60 - lr: 0.000030 - momentum: 0.000000
2023-10-17 15:10:48,052 epoch 2 - iter 178/893 - loss 0.11680102 - time (sec): 14.23 - samples/sec: 3631.43 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:10:54,953 epoch 2 - iter 267/893 - loss 0.11328328 - time (sec): 21.14 - samples/sec: 3618.39 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:11:01,904 epoch 2 - iter 356/893 - loss 0.11241774 - time (sec): 28.09 - samples/sec: 3575.71 - lr: 0.000029 - momentum: 0.000000
2023-10-17 15:11:08,351 epoch 2 - iter 445/893 - loss 0.11060263 - time (sec): 34.53 - samples/sec: 3583.85 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:11:15,113 epoch 2 - iter 534/893 - loss 0.11047951 - time (sec): 41.30 - samples/sec: 3576.47 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:11:22,245 epoch 2 - iter 623/893 - loss 0.11086841 - time (sec): 48.43 - samples/sec: 3549.93 - lr: 0.000028 - momentum: 0.000000
2023-10-17 15:11:29,735 epoch 2 - iter 712/893 - loss 0.10835554 - time (sec): 55.92 - samples/sec: 3542.40 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:11:36,607 epoch 2 - iter 801/893 - loss 0.10647861 - time (sec): 62.79 - samples/sec: 3542.10 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:11:44,089 epoch 2 - iter 890/893 - loss 0.10520436 - time (sec): 70.27 - samples/sec: 3533.00 - lr: 0.000027 - momentum: 0.000000
2023-10-17 15:11:44,268 ----------------------------------------------------------------------------------------------------
2023-10-17 15:11:44,268 EPOCH 2 done: loss 0.1053 - lr: 0.000027
2023-10-17 15:11:49,317 DEV : loss 0.10727142542600632 - f1-score (micro avg) 0.7891
2023-10-17 15:11:49,336 saving best model
2023-10-17 15:11:49,791 ----------------------------------------------------------------------------------------------------
2023-10-17 15:11:56,578 epoch 3 - iter 89/893 - loss 0.07446046 - time (sec): 6.79 - samples/sec: 3584.31 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:03,419 epoch 3 - iter 178/893 - loss 0.07112677 - time (sec): 13.63 - samples/sec: 3654.95 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:11,006 epoch 3 - iter 267/893 - loss 0.06742536 - time (sec): 21.21 - samples/sec: 3644.25 - lr: 0.000026 - momentum: 0.000000
2023-10-17 15:12:17,869 epoch 3 - iter 356/893 - loss 0.06600880 - time (sec): 28.08 - samples/sec: 3630.84 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:12:24,959 epoch 3 - iter 445/893 - loss 0.06825375 - time (sec): 35.17 - samples/sec: 3645.98 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:12:31,281 epoch 3 - iter 534/893 - loss 0.06844461 - time (sec): 41.49 - samples/sec: 3627.52 - lr: 0.000025 - momentum: 0.000000
2023-10-17 15:12:37,870 epoch 3 - iter 623/893 - loss 0.06742613 - time (sec): 48.08 - samples/sec: 3612.75 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:12:45,061 epoch 3 - iter 712/893 - loss 0.06721754 - time (sec): 55.27 - samples/sec: 3605.20 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:12:52,659 epoch 3 - iter 801/893 - loss 0.06717275 - time (sec): 62.87 - samples/sec: 3579.35 - lr: 0.000024 - momentum: 0.000000
2023-10-17 15:12:59,217 epoch 3 - iter 890/893 - loss 0.06707278 - time (sec): 69.42 - samples/sec: 3571.54 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:12:59,452 ----------------------------------------------------------------------------------------------------
2023-10-17 15:12:59,452 EPOCH 3 done: loss 0.0670 - lr: 0.000023
2023-10-17 15:13:04,383 DEV : loss 0.1304038017988205 - f1-score (micro avg) 0.7963
2023-10-17 15:13:04,400 saving best model
2023-10-17 15:13:04,853 ----------------------------------------------------------------------------------------------------
2023-10-17 15:13:12,174 epoch 4 - iter 89/893 - loss 0.04868423 - time (sec): 7.32 - samples/sec: 3502.35 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:13:19,287 epoch 4 - iter 178/893 - loss 0.04241356 - time (sec): 14.43 - samples/sec: 3520.55 - lr: 0.000023 - momentum: 0.000000
2023-10-17 15:13:26,239 epoch 4 - iter 267/893 - loss 0.04403277 - time (sec): 21.38 - samples/sec: 3544.00 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:13:32,878 epoch 4 - iter 356/893 - loss 0.04443889 - time (sec): 28.02 - samples/sec: 3549.20 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:13:39,966 epoch 4 - iter 445/893 - loss 0.04684561 - time (sec): 35.11 - samples/sec: 3518.63 - lr: 0.000022 - momentum: 0.000000
2023-10-17 15:13:46,763 epoch 4 - iter 534/893 - loss 0.04726451 - time (sec): 41.91 - samples/sec: 3522.77 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:13:54,015 epoch 4 - iter 623/893 - loss 0.04633020 - time (sec): 49.16 - samples/sec: 3525.55 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:14:01,146 epoch 4 - iter 712/893 - loss 0.04752612 - time (sec): 56.29 - samples/sec: 3522.62 - lr: 0.000021 - momentum: 0.000000
2023-10-17 15:14:08,348 epoch 4 - iter 801/893 - loss 0.04742352 - time (sec): 63.49 - samples/sec: 3521.82 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:14:15,272 epoch 4 - iter 890/893 - loss 0.04712026 - time (sec): 70.42 - samples/sec: 3519.42 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:14:15,544 ----------------------------------------------------------------------------------------------------
2023-10-17 15:14:15,544 EPOCH 4 done: loss 0.0472 - lr: 0.000020
2023-10-17 15:14:19,791 DEV : loss 0.1480644792318344 - f1-score (micro avg) 0.822
2023-10-17 15:14:19,808 saving best model
2023-10-17 15:14:20,262 ----------------------------------------------------------------------------------------------------
2023-10-17 15:14:27,481 epoch 5 - iter 89/893 - loss 0.02697342 - time (sec): 7.21 - samples/sec: 3458.11 - lr: 0.000020 - momentum: 0.000000
2023-10-17 15:14:34,191 epoch 5 - iter 178/893 - loss 0.02931270 - time (sec): 13.92 - samples/sec: 3520.13 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:14:41,111 epoch 5 - iter 267/893 - loss 0.03360478 - time (sec): 20.84 - samples/sec: 3527.20 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:14:47,741 epoch 5 - iter 356/893 - loss 0.03435094 - time (sec): 27.48 - samples/sec: 3527.12 - lr: 0.000019 - momentum: 0.000000
2023-10-17 15:14:55,013 epoch 5 - iter 445/893 - loss 0.03355795 - time (sec): 34.75 - samples/sec: 3491.76 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:15:02,081 epoch 5 - iter 534/893 - loss 0.03507889 - time (sec): 41.81 - samples/sec: 3504.28 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:15:09,134 epoch 5 - iter 623/893 - loss 0.03463081 - time (sec): 48.87 - samples/sec: 3519.94 - lr: 0.000018 - momentum: 0.000000
2023-10-17 15:15:16,208 epoch 5 - iter 712/893 - loss 0.03474055 - time (sec): 55.94 - samples/sec: 3524.52 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:15:23,557 epoch 5 - iter 801/893 - loss 0.03527026 - time (sec): 63.29 - samples/sec: 3527.17 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:15:30,470 epoch 5 - iter 890/893 - loss 0.03498960 - time (sec): 70.20 - samples/sec: 3534.99 - lr: 0.000017 - momentum: 0.000000
2023-10-17 15:15:30,640 ----------------------------------------------------------------------------------------------------
2023-10-17 15:15:30,640 EPOCH 5 done: loss 0.0349 - lr: 0.000017
2023-10-17 15:15:35,393 DEV : loss 0.15961149334907532 - f1-score (micro avg) 0.8035
2023-10-17 15:15:35,410 ----------------------------------------------------------------------------------------------------
2023-10-17 15:15:42,408 epoch 6 - iter 89/893 - loss 0.02420361 - time (sec): 7.00 - samples/sec: 3548.22 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:15:48,903 epoch 6 - iter 178/893 - loss 0.02406819 - time (sec): 13.49 - samples/sec: 3543.80 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:15:56,114 epoch 6 - iter 267/893 - loss 0.02420778 - time (sec): 20.70 - samples/sec: 3530.79 - lr: 0.000016 - momentum: 0.000000
2023-10-17 15:16:03,630 epoch 6 - iter 356/893 - loss 0.02473679 - time (sec): 28.22 - samples/sec: 3491.53 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:16:10,515 epoch 6 - iter 445/893 - loss 0.02601161 - time (sec): 35.10 - samples/sec: 3511.49 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:16:17,643 epoch 6 - iter 534/893 - loss 0.02577712 - time (sec): 42.23 - samples/sec: 3540.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 15:16:24,783 epoch 6 - iter 623/893 - loss 0.02608412 - time (sec): 49.37 - samples/sec: 3539.28 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:16:31,686 epoch 6 - iter 712/893 - loss 0.02665696 - time (sec): 56.27 - samples/sec: 3545.51 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:16:38,520 epoch 6 - iter 801/893 - loss 0.02745268 - time (sec): 63.11 - samples/sec: 3553.56 - lr: 0.000014 - momentum: 0.000000
2023-10-17 15:16:45,563 epoch 6 - iter 890/893 - loss 0.02802573 - time (sec): 70.15 - samples/sec: 3535.40 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:16:45,762 ----------------------------------------------------------------------------------------------------
2023-10-17 15:16:45,762 EPOCH 6 done: loss 0.0280 - lr: 0.000013
2023-10-17 15:16:50,006 DEV : loss 0.17348243296146393 - f1-score (micro avg) 0.8118
2023-10-17 15:16:50,025 ----------------------------------------------------------------------------------------------------
2023-10-17 15:16:57,708 epoch 7 - iter 89/893 - loss 0.01778190 - time (sec): 7.68 - samples/sec: 3402.72 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:17:04,684 epoch 7 - iter 178/893 - loss 0.01956122 - time (sec): 14.66 - samples/sec: 3453.95 - lr: 0.000013 - momentum: 0.000000
2023-10-17 15:17:11,587 epoch 7 - iter 267/893 - loss 0.01856255 - time (sec): 21.56 - samples/sec: 3439.09 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:17:18,586 epoch 7 - iter 356/893 - loss 0.01988639 - time (sec): 28.56 - samples/sec: 3483.76 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:17:25,653 epoch 7 - iter 445/893 - loss 0.01938792 - time (sec): 35.63 - samples/sec: 3491.34 - lr: 0.000012 - momentum: 0.000000
2023-10-17 15:17:32,453 epoch 7 - iter 534/893 - loss 0.02095563 - time (sec): 42.43 - samples/sec: 3513.52 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:17:39,259 epoch 7 - iter 623/893 - loss 0.02165179 - time (sec): 49.23 - samples/sec: 3520.41 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:17:46,264 epoch 7 - iter 712/893 - loss 0.02144328 - time (sec): 56.24 - samples/sec: 3506.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 15:17:53,665 epoch 7 - iter 801/893 - loss 0.02112030 - time (sec): 63.64 - samples/sec: 3505.50 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:18:00,568 epoch 7 - iter 890/893 - loss 0.02096110 - time (sec): 70.54 - samples/sec: 3519.21 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:18:00,782 ----------------------------------------------------------------------------------------------------
2023-10-17 15:18:00,782 EPOCH 7 done: loss 0.0211 - lr: 0.000010
2023-10-17 15:18:05,006 DEV : loss 0.19632981717586517 - f1-score (micro avg) 0.8309
2023-10-17 15:18:05,022 saving best model
2023-10-17 15:18:05,530 ----------------------------------------------------------------------------------------------------
2023-10-17 15:18:12,423 epoch 8 - iter 89/893 - loss 0.01429665 - time (sec): 6.89 - samples/sec: 3489.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 15:18:19,109 epoch 8 - iter 178/893 - loss 0.01707880 - time (sec): 13.58 - samples/sec: 3533.97 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:18:26,443 epoch 8 - iter 267/893 - loss 0.01664126 - time (sec): 20.91 - samples/sec: 3503.31 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:18:33,523 epoch 8 - iter 356/893 - loss 0.01557798 - time (sec): 27.99 - samples/sec: 3547.51 - lr: 0.000009 - momentum: 0.000000
2023-10-17 15:18:41,081 epoch 8 - iter 445/893 - loss 0.01559462 - time (sec): 35.55 - samples/sec: 3562.24 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:18:48,096 epoch 8 - iter 534/893 - loss 0.01483953 - time (sec): 42.56 - samples/sec: 3576.21 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:18:55,230 epoch 8 - iter 623/893 - loss 0.01549873 - time (sec): 49.70 - samples/sec: 3559.73 - lr: 0.000008 - momentum: 0.000000
2023-10-17 15:19:02,083 epoch 8 - iter 712/893 - loss 0.01562235 - time (sec): 56.55 - samples/sec: 3547.03 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:19:08,926 epoch 8 - iter 801/893 - loss 0.01515266 - time (sec): 63.39 - samples/sec: 3543.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:19:15,547 epoch 8 - iter 890/893 - loss 0.01524696 - time (sec): 70.01 - samples/sec: 3538.74 - lr: 0.000007 - momentum: 0.000000
2023-10-17 15:19:15,834 ----------------------------------------------------------------------------------------------------
2023-10-17 15:19:15,834 EPOCH 8 done: loss 0.0152 - lr: 0.000007
2023-10-17 15:19:21,293 DEV : loss 0.20254144072532654 - f1-score (micro avg) 0.8268
2023-10-17 15:19:21,322 ----------------------------------------------------------------------------------------------------
2023-10-17 15:19:28,171 epoch 9 - iter 89/893 - loss 0.01120597 - time (sec): 6.85 - samples/sec: 3518.61 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:19:34,750 epoch 9 - iter 178/893 - loss 0.01273785 - time (sec): 13.43 - samples/sec: 3581.27 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:19:41,616 epoch 9 - iter 267/893 - loss 0.01274289 - time (sec): 20.29 - samples/sec: 3581.05 - lr: 0.000006 - momentum: 0.000000
2023-10-17 15:19:48,297 epoch 9 - iter 356/893 - loss 0.01222354 - time (sec): 26.97 - samples/sec: 3590.32 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:19:55,627 epoch 9 - iter 445/893 - loss 0.01162771 - time (sec): 34.30 - samples/sec: 3585.72 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:20:02,456 epoch 9 - iter 534/893 - loss 0.01219240 - time (sec): 41.13 - samples/sec: 3617.29 - lr: 0.000005 - momentum: 0.000000
2023-10-17 15:20:09,469 epoch 9 - iter 623/893 - loss 0.01234197 - time (sec): 48.15 - samples/sec: 3586.34 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:20:16,331 epoch 9 - iter 712/893 - loss 0.01245901 - time (sec): 55.01 - samples/sec: 3593.61 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:20:23,343 epoch 9 - iter 801/893 - loss 0.01235256 - time (sec): 62.02 - samples/sec: 3585.82 - lr: 0.000004 - momentum: 0.000000
2023-10-17 15:20:31,178 epoch 9 - iter 890/893 - loss 0.01174338 - time (sec): 69.85 - samples/sec: 3547.85 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:20:31,426 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:31,426 EPOCH 9 done: loss 0.0117 - lr: 0.000003
2023-10-17 15:20:35,866 DEV : loss 0.21009869873523712 - f1-score (micro avg) 0.8231
2023-10-17 15:20:35,892 ----------------------------------------------------------------------------------------------------
2023-10-17 15:20:44,383 epoch 10 - iter 89/893 - loss 0.01168285 - time (sec): 8.49 - samples/sec: 2852.25 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:20:51,070 epoch 10 - iter 178/893 - loss 0.00956698 - time (sec): 15.18 - samples/sec: 3187.54 - lr: 0.000003 - momentum: 0.000000
2023-10-17 15:20:58,430 epoch 10 - iter 267/893 - loss 0.00810400 - time (sec): 22.54 - samples/sec: 3286.46 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:21:05,344 epoch 10 - iter 356/893 - loss 0.00938662 - time (sec): 29.45 - samples/sec: 3290.52 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:21:12,268 epoch 10 - iter 445/893 - loss 0.00895064 - time (sec): 36.37 - samples/sec: 3304.60 - lr: 0.000002 - momentum: 0.000000
2023-10-17 15:21:19,480 epoch 10 - iter 534/893 - loss 0.00957070 - time (sec): 43.58 - samples/sec: 3335.18 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:21:26,763 epoch 10 - iter 623/893 - loss 0.00910774 - time (sec): 50.87 - samples/sec: 3355.39 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:21:33,995 epoch 10 - iter 712/893 - loss 0.00887038 - time (sec): 58.10 - samples/sec: 3359.76 - lr: 0.000001 - momentum: 0.000000
2023-10-17 15:21:41,182 epoch 10 - iter 801/893 - loss 0.00857775 - time (sec): 65.29 - samples/sec: 3378.07 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:21:48,658 epoch 10 - iter 890/893 - loss 0.00850610 - time (sec): 72.76 - samples/sec: 3408.45 - lr: 0.000000 - momentum: 0.000000
2023-10-17 15:21:48,903 ----------------------------------------------------------------------------------------------------
2023-10-17 15:21:48,903 EPOCH 10 done: loss 0.0085 - lr: 0.000000
2023-10-17 15:21:53,236 DEV : loss 0.207401305437088 - f1-score (micro avg) 0.8282
2023-10-17 15:21:53,620 ----------------------------------------------------------------------------------------------------
2023-10-17 15:21:53,622 Loading model from best epoch ...
2023-10-17 15:21:54,979 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 15:22:05,756
Results:
- F-score (micro) 0.7185
- F-score (macro) 0.6326
- Accuracy 0.5744

By class:
              precision    recall  f1-score   support

         LOC     0.7239    0.7397    0.7317      1095
         PER     0.7950    0.7816    0.7882      1012
         ORG     0.4939    0.5630    0.5262       357
   HumanProd     0.3710    0.6970    0.4842        33

   micro avg     0.7065    0.7309    0.7185      2497
   macro avg     0.5959    0.6953    0.6326      2497
weighted avg     0.7151    0.7309    0.7220      2497

2023-10-17 15:22:05,756 ----------------------------------------------------------------------------------------------------