2023-10-17 13:11:11,277 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:11,278 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 13:11:11,278 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:11,278 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 13:11:11,278
----------------------------------------------------------------------------------------------------
2023-10-17 13:11:11,278 Train: 7142 sentences
2023-10-17 13:11:11,278 (train_with_dev=False, train_with_test=False)
2023-10-17 13:11:11,278 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:11,278 Training Params:
2023-10-17 13:11:11,278  - learning_rate: "3e-05"
2023-10-17 13:11:11,278  - mini_batch_size: "8"
2023-10-17 13:11:11,278  - max_epochs: "10"
2023-10-17 13:11:11,278  - shuffle: "True"
2023-10-17 13:11:11,278 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:11,278 Plugins:
2023-10-17 13:11:11,278  - TensorboardLogger
2023-10-17 13:11:11,278  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 13:11:11,278 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:11,278 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 13:11:11,279  - metric: "('micro avg', 'f1-score')"
2023-10-17 13:11:11,279 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:11,279 Computation:
2023-10-17 13:11:11,279  - compute on device: cuda:0
2023-10-17 13:11:11,279  - embedding storage: none
2023-10-17 13:11:11,279 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:11,279 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 13:11:11,279 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:11,279 ----------------------------------------------------------------------------------------------------
2023-10-17 13:11:11,279 Logging anything other than scalars to TensorBoard
is currently not supported.
2023-10-17 13:11:17,729 epoch 1 - iter 89/893 - loss 3.17640818 - time (sec): 6.45 - samples/sec: 3567.94 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:11:25,226 epoch 1 - iter 178/893 - loss 1.90963577 - time (sec): 13.95 - samples/sec: 3560.95 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:11:32,086 epoch 1 - iter 267/893 - loss 1.43296538 - time (sec): 20.81 - samples/sec: 3582.18 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:11:39,075 epoch 1 - iter 356/893 - loss 1.15477705 - time (sec): 27.79 - samples/sec: 3637.17 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:11:45,734 epoch 1 - iter 445/893 - loss 0.98822597 - time (sec): 34.45 - samples/sec: 3622.67 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:11:52,343 epoch 1 - iter 534/893 - loss 0.87512641 - time (sec): 41.06 - samples/sec: 3607.98 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:11:59,231 epoch 1 - iter 623/893 - loss 0.77872879 - time (sec): 47.95 - samples/sec: 3603.86 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:12:06,773 epoch 1 - iter 712/893 - loss 0.69710433 - time (sec): 55.49 - samples/sec: 3590.45 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:12:13,497 epoch 1 - iter 801/893 - loss 0.64099988 - time (sec): 62.22 - samples/sec: 3581.84 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:12:20,024 epoch 1 - iter 890/893 - loss 0.59043670 - time (sec): 68.74 - samples/sec: 3610.20 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:12:20,188 ----------------------------------------------------------------------------------------------------
2023-10-17 13:12:20,188 EPOCH 1 done: loss 0.5894 - lr: 0.000030
2023-10-17 13:12:23,684 DEV : loss 0.1309102326631546 - f1-score (micro avg) 0.7109
2023-10-17 13:12:23,700 saving best model
2023-10-17 13:12:24,054 ----------------------------------------------------------------------------------------------------
2023-10-17 13:12:30,769 epoch 2 - iter 89/893 - loss 0.14316887 - time (sec): 6.71 -
samples/sec: 3621.62 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:12:37,450 epoch 2 - iter 178/893 - loss 0.13029728 - time (sec): 13.40 - samples/sec: 3585.94 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:12:43,811 epoch 2 - iter 267/893 - loss 0.12564379 - time (sec): 19.76 - samples/sec: 3537.84 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:12:51,119 epoch 2 - iter 356/893 - loss 0.12243479 - time (sec): 27.06 - samples/sec: 3512.89 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:12:58,176 epoch 2 - iter 445/893 - loss 0.11791540 - time (sec): 34.12 - samples/sec: 3559.26 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:13:05,491 epoch 2 - iter 534/893 - loss 0.11598392 - time (sec): 41.44 - samples/sec: 3556.37 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:13:12,846 epoch 2 - iter 623/893 - loss 0.11205928 - time (sec): 48.79 - samples/sec: 3584.65 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:13:19,691 epoch 2 - iter 712/893 - loss 0.11279841 - time (sec): 55.64 - samples/sec: 3594.12 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:13:26,447 epoch 2 - iter 801/893 - loss 0.11134085 - time (sec): 62.39 - samples/sec: 3596.97 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:13:33,002 epoch 2 - iter 890/893 - loss 0.11060956 - time (sec): 68.95 - samples/sec: 3598.61 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:13:33,203 ----------------------------------------------------------------------------------------------------
2023-10-17 13:13:33,203 EPOCH 2 done: loss 0.1104 - lr: 0.000027
2023-10-17 13:13:37,905 DEV : loss 0.0951007828116417 - f1-score (micro avg) 0.781
2023-10-17 13:13:37,921 saving best model
2023-10-17 13:13:38,363 ----------------------------------------------------------------------------------------------------
2023-10-17 13:13:45,310 epoch 3 - iter 89/893 - loss 0.06009239 - time (sec): 6.94 - samples/sec: 3534.29 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:13:52,596 epoch 3 - iter 178/893 - loss 0.06163340
- time (sec): 14.23 - samples/sec: 3472.19 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:13:59,664 epoch 3 - iter 267/893 - loss 0.06211706 - time (sec): 21.30 - samples/sec: 3505.16 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:14:06,493 epoch 3 - iter 356/893 - loss 0.06319947 - time (sec): 28.13 - samples/sec: 3498.90 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:14:13,459 epoch 3 - iter 445/893 - loss 0.06493322 - time (sec): 35.09 - samples/sec: 3534.47 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:14:20,102 epoch 3 - iter 534/893 - loss 0.06735594 - time (sec): 41.73 - samples/sec: 3545.54 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:14:26,882 epoch 3 - iter 623/893 - loss 0.06931203 - time (sec): 48.51 - samples/sec: 3551.83 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:14:34,327 epoch 3 - iter 712/893 - loss 0.06862328 - time (sec): 55.96 - samples/sec: 3544.56 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:14:41,579 epoch 3 - iter 801/893 - loss 0.06871328 - time (sec): 63.21 - samples/sec: 3526.20 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:14:48,833 epoch 3 - iter 890/893 - loss 0.06879528 - time (sec): 70.47 - samples/sec: 3514.25 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:14:49,131 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:49,131 EPOCH 3 done: loss 0.0688 - lr: 0.000023
2023-10-17 13:14:53,406 DEV : loss 0.11377345025539398 - f1-score (micro avg) 0.7968
2023-10-17 13:14:53,428 saving best model
2023-10-17 13:14:53,990 ----------------------------------------------------------------------------------------------------
2023-10-17 13:15:01,105 epoch 4 - iter 89/893 - loss 0.04314675 - time (sec): 7.11 - samples/sec: 3634.63 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:15:08,005 epoch 4 - iter 178/893 - loss 0.04212018 - time (sec): 14.01 - samples/sec: 3602.63 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:15:14,836 epoch 4 - iter
267/893 - loss 0.04203874 - time (sec): 20.84 - samples/sec: 3648.77 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:15:22,695 epoch 4 - iter 356/893 - loss 0.04350451 - time (sec): 28.70 - samples/sec: 3552.98 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:15:29,418 epoch 4 - iter 445/893 - loss 0.04537159 - time (sec): 35.43 - samples/sec: 3550.76 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:15:36,766 epoch 4 - iter 534/893 - loss 0.04512834 - time (sec): 42.77 - samples/sec: 3533.91 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:15:43,643 epoch 4 - iter 623/893 - loss 0.04598418 - time (sec): 49.65 - samples/sec: 3548.38 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:15:50,025 epoch 4 - iter 712/893 - loss 0.04512219 - time (sec): 56.03 - samples/sec: 3565.22 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:15:56,938 epoch 4 - iter 801/893 - loss 0.04525561 - time (sec): 62.95 - samples/sec: 3559.74 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:16:03,783 epoch 4 - iter 890/893 - loss 0.04519123 - time (sec): 69.79 - samples/sec: 3550.95 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:16:04,000 ----------------------------------------------------------------------------------------------------
2023-10-17 13:16:04,001 EPOCH 4 done: loss 0.0452 - lr: 0.000020
2023-10-17 13:16:08,187 DEV : loss 0.13433651626110077 - f1-score (micro avg) 0.7974
2023-10-17 13:16:08,203 saving best model
2023-10-17 13:16:08,657 ----------------------------------------------------------------------------------------------------
2023-10-17 13:16:15,584 epoch 5 - iter 89/893 - loss 0.02663720 - time (sec): 6.92 - samples/sec: 3584.33 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:16:22,861 epoch 5 - iter 178/893 - loss 0.03457466 - time (sec): 14.20 - samples/sec: 3604.57 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:16:29,909 epoch 5 - iter 267/893 - loss 0.03692191 - time (sec): 21.25 - samples/sec: 3614.64 - lr: 0.000019 - momentum: 0.000000
2023-10-17
13:16:36,809 epoch 5 - iter 356/893 - loss 0.03391549 - time (sec): 28.15 - samples/sec: 3602.21 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:16:43,439 epoch 5 - iter 445/893 - loss 0.03368263 - time (sec): 34.78 - samples/sec: 3584.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:16:50,365 epoch 5 - iter 534/893 - loss 0.03337125 - time (sec): 41.70 - samples/sec: 3595.52 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:16:57,083 epoch 5 - iter 623/893 - loss 0.03331068 - time (sec): 48.42 - samples/sec: 3596.17 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:17:03,744 epoch 5 - iter 712/893 - loss 0.03340135 - time (sec): 55.08 - samples/sec: 3610.83 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:17:10,224 epoch 5 - iter 801/893 - loss 0.03399885 - time (sec): 61.56 - samples/sec: 3594.94 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:17:17,777 epoch 5 - iter 890/893 - loss 0.03500611 - time (sec): 69.11 - samples/sec: 3586.14 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:17:18,033 ----------------------------------------------------------------------------------------------------
2023-10-17 13:17:18,034 EPOCH 5 done: loss 0.0350 - lr: 0.000017
2023-10-17 13:17:23,075 DEV : loss 0.16414399445056915 - f1-score (micro avg) 0.8157
2023-10-17 13:17:23,101 saving best model
2023-10-17 13:17:23,652 ----------------------------------------------------------------------------------------------------
2023-10-17 13:17:30,901 epoch 6 - iter 89/893 - loss 0.02576213 - time (sec): 7.25 - samples/sec: 3458.60 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:17:38,138 epoch 6 - iter 178/893 - loss 0.02004470 - time (sec): 14.48 - samples/sec: 3540.98 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:17:45,045 epoch 6 - iter 267/893 - loss 0.02189040 - time (sec): 21.39 - samples/sec: 3511.63 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:17:51,973 epoch 6 - iter 356/893 - loss 0.02397224 - time (sec): 28.32 - samples/sec: 3522.26 - lr: 0.000015 -
momentum: 0.000000
2023-10-17 13:17:58,788 epoch 6 - iter 445/893 - loss 0.02380844 - time (sec): 35.13 - samples/sec: 3521.38 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:18:05,526 epoch 6 - iter 534/893 - loss 0.02394223 - time (sec): 41.87 - samples/sec: 3532.23 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:18:12,521 epoch 6 - iter 623/893 - loss 0.02530797 - time (sec): 48.87 - samples/sec: 3535.13 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:18:19,655 epoch 6 - iter 712/893 - loss 0.02592809 - time (sec): 56.00 - samples/sec: 3545.81 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:18:26,481 epoch 6 - iter 801/893 - loss 0.02684141 - time (sec): 62.83 - samples/sec: 3545.87 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:18:33,294 epoch 6 - iter 890/893 - loss 0.02642829 - time (sec): 69.64 - samples/sec: 3559.08 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:18:33,524 ----------------------------------------------------------------------------------------------------
2023-10-17 13:18:33,524 EPOCH 6 done: loss 0.0264 - lr: 0.000013
2023-10-17 13:18:37,850 DEV : loss 0.18563830852508545 - f1-score (micro avg) 0.8172
2023-10-17 13:18:37,874 saving best model
2023-10-17 13:18:38,348 ----------------------------------------------------------------------------------------------------
2023-10-17 13:18:45,762 epoch 7 - iter 89/893 - loss 0.01817170 - time (sec): 7.41 - samples/sec: 3465.31 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:18:52,773 epoch 7 - iter 178/893 - loss 0.01818330 - time (sec): 14.42 - samples/sec: 3523.28 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:18:59,402 epoch 7 - iter 267/893 - loss 0.02201803 - time (sec): 21.05 - samples/sec: 3557.52 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:19:06,347 epoch 7 - iter 356/893 - loss 0.02072965 - time (sec): 28.00 - samples/sec: 3587.87 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:19:13,456 epoch 7 - iter 445/893 - loss 0.02090492 - time (sec): 35.11 - samples/sec:
3581.66 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:19:19,967 epoch 7 - iter 534/893 - loss 0.02201572 - time (sec): 41.62 - samples/sec: 3592.31 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:19:26,946 epoch 7 - iter 623/893 - loss 0.02108398 - time (sec): 48.60 - samples/sec: 3609.75 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:19:34,487 epoch 7 - iter 712/893 - loss 0.02078526 - time (sec): 56.14 - samples/sec: 3563.62 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:19:41,189 epoch 7 - iter 801/893 - loss 0.02103021 - time (sec): 62.84 - samples/sec: 3543.03 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:19:47,972 epoch 7 - iter 890/893 - loss 0.02114067 - time (sec): 69.62 - samples/sec: 3554.30 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:19:48,204 ----------------------------------------------------------------------------------------------------
2023-10-17 13:19:48,205 EPOCH 7 done: loss 0.0210 - lr: 0.000010
2023-10-17 13:19:52,385 DEV : loss 0.192277729511261 - f1-score (micro avg) 0.8134
2023-10-17 13:19:52,403 ----------------------------------------------------------------------------------------------------
2023-10-17 13:19:59,480 epoch 8 - iter 89/893 - loss 0.02147607 - time (sec): 7.08 - samples/sec: 3430.07 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:20:06,279 epoch 8 - iter 178/893 - loss 0.01787962 - time (sec): 13.88 - samples/sec: 3518.86 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:20:13,115 epoch 8 - iter 267/893 - loss 0.01667893 - time (sec): 20.71 - samples/sec: 3535.33 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:20:20,113 epoch 8 - iter 356/893 - loss 0.01791450 - time (sec): 27.71 - samples/sec: 3519.58 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:20:27,076 epoch 8 - iter 445/893 - loss 0.01791560 - time (sec): 34.67 - samples/sec: 3528.83 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:20:34,078 epoch 8 - iter 534/893 - loss 0.01715362 - time (sec): 41.67 - samples/sec: 3510.38 - lr:
0.000008 - momentum: 0.000000
2023-10-17 13:20:40,855 epoch 8 - iter 623/893 - loss 0.01679633 - time (sec): 48.45 - samples/sec: 3541.29 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:20:48,575 epoch 8 - iter 712/893 - loss 0.01677249 - time (sec): 56.17 - samples/sec: 3532.74 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:20:55,271 epoch 8 - iter 801/893 - loss 0.01655037 - time (sec): 62.87 - samples/sec: 3540.18 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:21:02,258 epoch 8 - iter 890/893 - loss 0.01633471 - time (sec): 69.85 - samples/sec: 3552.07 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:21:02,481 ----------------------------------------------------------------------------------------------------
2023-10-17 13:21:02,481 EPOCH 8 done: loss 0.0164 - lr: 0.000007
2023-10-17 13:21:07,282 DEV : loss 0.20144419372081757 - f1-score (micro avg) 0.8209
2023-10-17 13:21:07,298 saving best model
2023-10-17 13:21:07,742 ----------------------------------------------------------------------------------------------------
2023-10-17 13:21:15,272 epoch 9 - iter 89/893 - loss 0.01303715 - time (sec): 7.53 - samples/sec: 3392.08 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:21:22,230 epoch 9 - iter 178/893 - loss 0.01030507 - time (sec): 14.48 - samples/sec: 3520.60 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:21:29,195 epoch 9 - iter 267/893 - loss 0.01126361 - time (sec): 21.45 - samples/sec: 3507.37 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:21:35,689 epoch 9 - iter 356/893 - loss 0.01138572 - time (sec): 27.94 - samples/sec: 3539.60 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:21:42,527 epoch 9 - iter 445/893 - loss 0.01062717 - time (sec): 34.78 - samples/sec: 3552.52 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:21:49,326 epoch 9 - iter 534/893 - loss 0.01090070 - time (sec): 41.58 - samples/sec: 3559.21 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:21:56,039 epoch 9 - iter 623/893 - loss 0.01125373 - time (sec): 48.29 -
samples/sec: 3544.39 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:22:03,001 epoch 9 - iter 712/893 - loss 0.01189391 - time (sec): 55.26 - samples/sec: 3549.75 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:22:10,157 epoch 9 - iter 801/893 - loss 0.01204849 - time (sec): 62.41 - samples/sec: 3549.36 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:22:17,175 epoch 9 - iter 890/893 - loss 0.01127410 - time (sec): 69.43 - samples/sec: 3575.12 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:22:17,350 ----------------------------------------------------------------------------------------------------
2023-10-17 13:22:17,350 EPOCH 9 done: loss 0.0112 - lr: 0.000003
2023-10-17 13:22:21,564 DEV : loss 0.21674306690692902 - f1-score (micro avg) 0.8196
2023-10-17 13:22:21,582 ----------------------------------------------------------------------------------------------------
2023-10-17 13:22:29,235 epoch 10 - iter 89/893 - loss 0.00452034 - time (sec): 7.65 - samples/sec: 3326.42 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:22:36,460 epoch 10 - iter 178/893 - loss 0.00602805 - time (sec): 14.88 - samples/sec: 3436.14 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:22:43,456 epoch 10 - iter 267/893 - loss 0.00834765 - time (sec): 21.87 - samples/sec: 3481.92 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:22:50,078 epoch 10 - iter 356/893 - loss 0.00830114 - time (sec): 28.49 - samples/sec: 3490.41 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:22:57,413 epoch 10 - iter 445/893 - loss 0.00846354 - time (sec): 35.83 - samples/sec: 3493.93 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:23:04,499 epoch 10 - iter 534/893 - loss 0.00808250 - time (sec): 42.92 - samples/sec: 3489.71 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:23:11,070 epoch 10 - iter 623/893 - loss 0.00835373 - time (sec): 49.49 - samples/sec: 3511.00 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:23:18,106 epoch 10 - iter 712/893 - loss 0.00815132 - time (sec): 56.52 -
samples/sec: 3524.44 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:23:24,781 epoch 10 - iter 801/893 - loss 0.00867055 - time (sec): 63.20 - samples/sec: 3534.94 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:23:31,859 epoch 10 - iter 890/893 - loss 0.00890620 - time (sec): 70.28 - samples/sec: 3527.58 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:23:32,042 ----------------------------------------------------------------------------------------------------
2023-10-17 13:23:32,042 EPOCH 10 done: loss 0.0089 - lr: 0.000000
2023-10-17 13:23:36,394 DEV : loss 0.20970353484153748 - f1-score (micro avg) 0.8248
2023-10-17 13:23:36,412 saving best model
2023-10-17 13:23:37,223 ----------------------------------------------------------------------------------------------------
2023-10-17 13:23:37,225 Loading model from best epoch ...
2023-10-17 13:23:38,664 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 13:23:48,249 Results:
- F-score (micro) 0.7162
- F-score (macro) 0.6457
- Accuracy 0.5728

By class:
              precision    recall  f1-score   support

         LOC     0.7201    0.7306    0.7253      1095
         PER     0.7897    0.7905    0.7901      1012
         ORG     0.4731    0.5910    0.5255       357
   HumanProd     0.4127    0.7879    0.5417        33

   micro avg     0.6977    0.7357    0.7162      2497
   macro avg     0.5989    0.7250    0.6457      2497
weighted avg     0.7089    0.7357    0.7206      2497

2023-10-17 13:23:48,249 ----------------------------------------------------------------------------------------------------
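A note on the lr column in the log: it is consistent with a linear warmup-then-decay schedule, as configured by the LinearScheduler plugin with warmup_fraction '0.1' — the learning rate ramps linearly from 0 to the peak 3e-05 over the first 10% of the 8930 total steps (893 iterations × 10 epochs), then decays linearly back to 0. The following is a minimal sketch of that schedule in plain Python, assuming this is the formula the plugin applies (the log values match it):

```python
# Sketch of linear warmup + linear decay, using the run's hyperparameters.
# Assumption: Flair's LinearScheduler follows this standard formula.
PEAK_LR = 3e-05
STEPS_PER_EPOCH = 893
EPOCHS = 10
TOTAL_STEPS = STEPS_PER_EPOCH * EPOCHS        # 8930 optimizer updates
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)         # warmup_fraction 0.1 -> 893

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer updates (1-based)."""
    if step <= WARMUP_STEPS:
        # linear ramp from 0 up to the peak learning rate
        return PEAK_LR * step / WARMUP_STEPS
    # linear decay from the peak down to 0 over the remaining steps
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Matches the log: lr 0.000003 at epoch 1 / iter 89, peak 0.000030 at
# iter 890 of epoch 1, and lr 0.000000 by the end of epoch 10.
print(lr_at(89), lr_at(893), lr_at(8930))
```

This also explains why the best dev score keeps improving late in training: the schedule never plateaus, so the final epochs fine-tune with a very small learning rate.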
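As a sanity check on the final classification report: the macro averages are unweighted means over the four entity classes, the weighted averages are support-weighted means, and the micro F1 is the harmonic mean of micro precision and recall. A small sketch that recomputes the aggregate rows from the per-class rows (all numbers copied verbatim from the report):

```python
# (precision, recall, f1, support) per entity class, from the final report.
by_class = {
    "LOC":       (0.7201, 0.7306, 0.7253, 1095),
    "PER":       (0.7897, 0.7905, 0.7901, 1012),
    "ORG":       (0.4731, 0.5910, 0.5255,  357),
    "HumanProd": (0.4127, 0.7879, 0.5417,   33),
}

n = sum(v[3] for v in by_class.values())  # total support: 2497 test entities

# macro avg: unweighted mean of each metric over the classes
macro = [sum(v[i] for v in by_class.values()) / len(by_class) for i in range(3)]

# weighted avg: mean of each metric weighted by class support
weighted = [sum(v[i] * v[3] for v in by_class.values()) / n for i in range(3)]

# micro F1: harmonic mean of the reported micro precision and recall
micro_p, micro_r = 0.6977, 0.7357
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(macro)     # ~ (0.5989, 0.7250, 0.6457)
print(weighted)  # ~ (0.7089, 0.7357, 0.7206)
print(micro_f1)  # ~ 0.7162
```

The recomputed values agree with the logged "macro avg", "weighted avg", and micro F-score rows to four decimal places, and make the gap between micro F1 (0.7162, dominated by LOC/PER) and macro F1 (0.6457, dragged down by ORG and the tiny HumanProd class) easy to see.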