Upload ./training.log with huggingface_hub
de9f060
2023-10-25 21:16:52,404 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
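The layer shapes printed in the repr above are enough to estimate the tagger's parameter count. A minimal sketch in plain Python (counts derived only from the Embedding/Linear/LayerNorm shapes shown; dropout and activations contribute no parameters):

```python
# Estimate parameter counts from the module shapes in the repr above.
def linear(in_f, out_f):   # weight matrix + bias vector
    return in_f * out_f + out_f

def layer_norm(dim):       # scale + shift
    return 2 * dim

H = 768
embeddings = 64001 * H + 512 * H + 2 * H + layer_norm(H)  # word/pos/type + LayerNorm
per_layer = (
    3 * linear(H, H)                    # query, key, value
    + linear(H, H) + layer_norm(H)      # self-attention output
    + linear(H, 3072)                   # intermediate
    + linear(3072, H) + layer_norm(H)   # output
)
encoder = 12 * per_layer
pooler = linear(H, H)
tag_head = linear(H, 17)               # (linear): Linear(768 -> 17)

total = embeddings + encoder + pooler + tag_head
print(f"{total:,}")  # roughly 135M parameters
```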
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 Train: 1085 sentences
2023-10-25 21:16:52,405 (train_with_dev=False, train_with_test=False)
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 Training Params:
2023-10-25 21:16:52,405 - learning_rate: "5e-05"
2023-10-25 21:16:52,405 - mini_batch_size: "4"
2023-10-25 21:16:52,405 - max_epochs: "10"
2023-10-25 21:16:52,405 - shuffle: "True"
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 Plugins:
2023-10-25 21:16:52,405 - TensorboardLogger
2023-10-25 21:16:52,405 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
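The LinearScheduler plugin with warmup_fraction 0.1 implies the learning-rate trajectory visible in the per-iteration lines below: with 272 iterations per epoch over 10 epochs (2,720 steps total), the lr ramps from 0 to 5e-05 during the first 272 steps and then decays linearly to 0. A minimal sketch (peak lr and step counts taken from this log; the function name is illustrative):

```python
def linear_warmup_lr(step, peak=5e-05, total_steps=2720, warmup_fraction=0.1):
    """Linear warmup to `peak`, then linear decay back to zero."""
    warmup_steps = total_steps * warmup_fraction  # 272 steps = exactly epoch 1
    if step < warmup_steps:
        return peak * step / warmup_steps
    return peak * (total_steps - step) / (total_steps - warmup_steps)

# Matches the lr column in the log:
print(linear_warmup_lr(27))    # epoch 1, iter 27:   ~0.000005
print(linear_warmup_lr(270))   # epoch 1, iter 270:  ~0.000049
print(linear_warmup_lr(542))   # epoch 2, iter 270:  ~0.000045
print(linear_warmup_lr(2720))  # end of training:     0.0
```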
2023-10-25 21:16:52,405 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:16:52,405 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 Computation:
2023-10-25 21:16:52,406 - compute on device: cuda:0
2023-10-25 21:16:52,406 - embedding storage: none
2023-10-25 21:16:52,406 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,406 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 21:16:52,406 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,406 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,406 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:16:54,002 epoch 1 - iter 27/272 - loss 2.73511101 - time (sec): 1.60 - samples/sec: 3794.26 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:16:55,455 epoch 1 - iter 54/272 - loss 1.94519428 - time (sec): 3.05 - samples/sec: 3629.33 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:16:57,006 epoch 1 - iter 81/272 - loss 1.57460458 - time (sec): 4.60 - samples/sec: 3365.47 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:16:58,543 epoch 1 - iter 108/272 - loss 1.29419222 - time (sec): 6.14 - samples/sec: 3339.84 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:17:00,067 epoch 1 - iter 135/272 - loss 1.10694561 - time (sec): 7.66 - samples/sec: 3289.62 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:17:01,650 epoch 1 - iter 162/272 - loss 0.94631686 - time (sec): 9.24 - samples/sec: 3336.01 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:17:03,210 epoch 1 - iter 189/272 - loss 0.83019629 - time (sec): 10.80 - samples/sec: 3385.74 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:17:04,765 epoch 1 - iter 216/272 - loss 0.74771962 - time (sec): 12.36 - samples/sec: 3387.51 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:17:06,315 epoch 1 - iter 243/272 - loss 0.69042563 - time (sec): 13.91 - samples/sec: 3346.12 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:17:07,824 epoch 1 - iter 270/272 - loss 0.63255195 - time (sec): 15.42 - samples/sec: 3364.57 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:17:07,926 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:07,926 EPOCH 1 done: loss 0.6322 - lr: 0.000049
2023-10-25 21:17:08,632 DEV : loss 0.13009829819202423 - f1-score (micro avg) 0.7173
2023-10-25 21:17:08,638 saving best model
2023-10-25 21:17:09,112 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:10,637 epoch 2 - iter 27/272 - loss 0.13866994 - time (sec): 1.52 - samples/sec: 3615.71 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:17:12,103 epoch 2 - iter 54/272 - loss 0.13389945 - time (sec): 2.99 - samples/sec: 3410.61 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:17:13,617 epoch 2 - iter 81/272 - loss 0.14062734 - time (sec): 4.50 - samples/sec: 3333.18 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:17:15,200 epoch 2 - iter 108/272 - loss 0.14321239 - time (sec): 6.09 - samples/sec: 3358.82 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:17:16,776 epoch 2 - iter 135/272 - loss 0.15486460 - time (sec): 7.66 - samples/sec: 3395.19 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:17:18,379 epoch 2 - iter 162/272 - loss 0.14631541 - time (sec): 9.26 - samples/sec: 3379.73 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:17:19,943 epoch 2 - iter 189/272 - loss 0.13556396 - time (sec): 10.83 - samples/sec: 3419.67 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:17:21,475 epoch 2 - iter 216/272 - loss 0.13296855 - time (sec): 12.36 - samples/sec: 3426.75 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:17:22,992 epoch 2 - iter 243/272 - loss 0.13035999 - time (sec): 13.88 - samples/sec: 3405.22 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:17:24,519 epoch 2 - iter 270/272 - loss 0.12823982 - time (sec): 15.40 - samples/sec: 3344.32 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:17:24,625 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:24,625 EPOCH 2 done: loss 0.1271 - lr: 0.000045
2023-10-25 21:17:26,305 DEV : loss 0.12012875825166702 - f1-score (micro avg) 0.7755
2023-10-25 21:17:26,312 saving best model
2023-10-25 21:17:26,974 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:28,496 epoch 3 - iter 27/272 - loss 0.05489200 - time (sec): 1.52 - samples/sec: 3451.25 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:17:30,027 epoch 3 - iter 54/272 - loss 0.05237056 - time (sec): 3.05 - samples/sec: 3163.33 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:17:31,543 epoch 3 - iter 81/272 - loss 0.04896805 - time (sec): 4.57 - samples/sec: 3251.03 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:17:33,047 epoch 3 - iter 108/272 - loss 0.05720601 - time (sec): 6.07 - samples/sec: 3167.80 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:17:34,582 epoch 3 - iter 135/272 - loss 0.06017766 - time (sec): 7.61 - samples/sec: 3239.90 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:17:36,066 epoch 3 - iter 162/272 - loss 0.06582158 - time (sec): 9.09 - samples/sec: 3282.43 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:17:37,640 epoch 3 - iter 189/272 - loss 0.06333105 - time (sec): 10.66 - samples/sec: 3318.01 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:17:39,236 epoch 3 - iter 216/272 - loss 0.06358727 - time (sec): 12.26 - samples/sec: 3353.15 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:17:40,822 epoch 3 - iter 243/272 - loss 0.06743678 - time (sec): 13.85 - samples/sec: 3390.62 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:17:42,290 epoch 3 - iter 270/272 - loss 0.06746195 - time (sec): 15.31 - samples/sec: 3375.19 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:17:42,389 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:42,389 EPOCH 3 done: loss 0.0678 - lr: 0.000039
2023-10-25 21:17:43,576 DEV : loss 0.13368070125579834 - f1-score (micro avg) 0.7856
2023-10-25 21:17:43,583 saving best model
2023-10-25 21:17:44,277 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:45,762 epoch 4 - iter 27/272 - loss 0.02597351 - time (sec): 1.48 - samples/sec: 3438.51 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:17:47,241 epoch 4 - iter 54/272 - loss 0.03908671 - time (sec): 2.96 - samples/sec: 3412.08 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:17:48,764 epoch 4 - iter 81/272 - loss 0.04178419 - time (sec): 4.49 - samples/sec: 3459.41 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:17:50,232 epoch 4 - iter 108/272 - loss 0.04031066 - time (sec): 5.95 - samples/sec: 3453.62 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:17:51,669 epoch 4 - iter 135/272 - loss 0.04375782 - time (sec): 7.39 - samples/sec: 3397.84 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:17:53,165 epoch 4 - iter 162/272 - loss 0.04117904 - time (sec): 8.89 - samples/sec: 3448.55 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:17:54,651 epoch 4 - iter 189/272 - loss 0.04053027 - time (sec): 10.37 - samples/sec: 3436.56 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:17:56,177 epoch 4 - iter 216/272 - loss 0.04161959 - time (sec): 11.90 - samples/sec: 3484.43 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:17:57,662 epoch 4 - iter 243/272 - loss 0.04033603 - time (sec): 13.38 - samples/sec: 3500.81 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:17:59,154 epoch 4 - iter 270/272 - loss 0.04150719 - time (sec): 14.88 - samples/sec: 3481.84 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:17:59,263 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:59,264 EPOCH 4 done: loss 0.0414 - lr: 0.000033
2023-10-25 21:18:00,579 DEV : loss 0.16597497463226318 - f1-score (micro avg) 0.7477
2023-10-25 21:18:00,586 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:02,148 epoch 5 - iter 27/272 - loss 0.02656488 - time (sec): 1.56 - samples/sec: 3420.72 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:18:03,672 epoch 5 - iter 54/272 - loss 0.03046942 - time (sec): 3.08 - samples/sec: 3250.62 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:18:05,134 epoch 5 - iter 81/272 - loss 0.02530309 - time (sec): 4.55 - samples/sec: 3316.03 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:18:06,608 epoch 5 - iter 108/272 - loss 0.03298758 - time (sec): 6.02 - samples/sec: 3322.10 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:18:08,118 epoch 5 - iter 135/272 - loss 0.02852542 - time (sec): 7.53 - samples/sec: 3390.72 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:18:09,598 epoch 5 - iter 162/272 - loss 0.02918451 - time (sec): 9.01 - samples/sec: 3387.66 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:18:11,085 epoch 5 - iter 189/272 - loss 0.02740291 - time (sec): 10.50 - samples/sec: 3304.99 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:18:12,583 epoch 5 - iter 216/272 - loss 0.02813313 - time (sec): 12.00 - samples/sec: 3345.46 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:18:14,147 epoch 5 - iter 243/272 - loss 0.03233798 - time (sec): 13.56 - samples/sec: 3404.86 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:18:15,739 epoch 5 - iter 270/272 - loss 0.03072248 - time (sec): 15.15 - samples/sec: 3414.00 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:18:15,853 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:15,854 EPOCH 5 done: loss 0.0306 - lr: 0.000028
2023-10-25 21:18:17,134 DEV : loss 0.14486266672611237 - f1-score (micro avg) 0.8346
2023-10-25 21:18:17,141 saving best model
2023-10-25 21:18:17,813 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:19,374 epoch 6 - iter 27/272 - loss 0.01515445 - time (sec): 1.56 - samples/sec: 3975.21 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:18:20,932 epoch 6 - iter 54/272 - loss 0.01925765 - time (sec): 3.12 - samples/sec: 3807.21 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:18:22,422 epoch 6 - iter 81/272 - loss 0.01952897 - time (sec): 4.61 - samples/sec: 3499.39 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:18:23,949 epoch 6 - iter 108/272 - loss 0.02282516 - time (sec): 6.13 - samples/sec: 3440.84 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:18:25,451 epoch 6 - iter 135/272 - loss 0.02150874 - time (sec): 7.63 - samples/sec: 3381.01 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:18:26,872 epoch 6 - iter 162/272 - loss 0.02859869 - time (sec): 9.06 - samples/sec: 3367.40 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:18:28,297 epoch 6 - iter 189/272 - loss 0.02820747 - time (sec): 10.48 - samples/sec: 3423.61 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:18:29,818 epoch 6 - iter 216/272 - loss 0.02595280 - time (sec): 12.00 - samples/sec: 3428.18 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:18:31,666 epoch 6 - iter 243/272 - loss 0.02675736 - time (sec): 13.85 - samples/sec: 3340.76 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:18:33,136 epoch 6 - iter 270/272 - loss 0.02531645 - time (sec): 15.32 - samples/sec: 3372.83 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:18:33,237 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:33,238 EPOCH 6 done: loss 0.0252 - lr: 0.000022
2023-10-25 21:18:34,449 DEV : loss 0.1511414498090744 - f1-score (micro avg) 0.8556
2023-10-25 21:18:34,457 saving best model
2023-10-25 21:18:35,122 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:36,598 epoch 7 - iter 27/272 - loss 0.00848877 - time (sec): 1.47 - samples/sec: 3491.50 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:18:38,116 epoch 7 - iter 54/272 - loss 0.01251791 - time (sec): 2.99 - samples/sec: 3486.97 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:18:39,650 epoch 7 - iter 81/272 - loss 0.02004690 - time (sec): 4.53 - samples/sec: 3391.27 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:18:41,138 epoch 7 - iter 108/272 - loss 0.02098249 - time (sec): 6.01 - samples/sec: 3482.94 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:18:42,649 epoch 7 - iter 135/272 - loss 0.02170136 - time (sec): 7.52 - samples/sec: 3444.78 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:18:44,177 epoch 7 - iter 162/272 - loss 0.02356662 - time (sec): 9.05 - samples/sec: 3526.03 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:18:45,659 epoch 7 - iter 189/272 - loss 0.02264799 - time (sec): 10.53 - samples/sec: 3481.29 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:18:47,128 epoch 7 - iter 216/272 - loss 0.02390930 - time (sec): 12.00 - samples/sec: 3497.71 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:18:48,634 epoch 7 - iter 243/272 - loss 0.02387702 - time (sec): 13.51 - samples/sec: 3464.23 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:18:50,197 epoch 7 - iter 270/272 - loss 0.02410360 - time (sec): 15.07 - samples/sec: 3423.05 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:18:50,306 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:50,306 EPOCH 7 done: loss 0.0239 - lr: 0.000017
2023-10-25 21:18:51,555 DEV : loss 0.15409183502197266 - f1-score (micro avg) 0.8355
2023-10-25 21:18:51,562 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:53,042 epoch 8 - iter 27/272 - loss 0.01560783 - time (sec): 1.48 - samples/sec: 3183.20 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:18:54,650 epoch 8 - iter 54/272 - loss 0.01094755 - time (sec): 3.09 - samples/sec: 3313.48 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:18:56,246 epoch 8 - iter 81/272 - loss 0.01203350 - time (sec): 4.68 - samples/sec: 3277.36 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:18:57,779 epoch 8 - iter 108/272 - loss 0.01261165 - time (sec): 6.22 - samples/sec: 3268.70 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:18:59,399 epoch 8 - iter 135/272 - loss 0.01335699 - time (sec): 7.84 - samples/sec: 3254.45 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:19:01,066 epoch 8 - iter 162/272 - loss 0.01326646 - time (sec): 9.50 - samples/sec: 3375.65 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:19:02,702 epoch 8 - iter 189/272 - loss 0.01258962 - time (sec): 11.14 - samples/sec: 3336.91 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:19:04,361 epoch 8 - iter 216/272 - loss 0.01307808 - time (sec): 12.80 - samples/sec: 3372.55 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:19:05,872 epoch 8 - iter 243/272 - loss 0.01357464 - time (sec): 14.31 - samples/sec: 3295.47 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:19:07,305 epoch 8 - iter 270/272 - loss 0.01324032 - time (sec): 15.74 - samples/sec: 3287.21 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:19:07,410 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:07,410 EPOCH 8 done: loss 0.0132 - lr: 0.000011
2023-10-25 21:19:08,648 DEV : loss 0.168454110622406 - f1-score (micro avg) 0.8476
2023-10-25 21:19:08,655 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:10,113 epoch 9 - iter 27/272 - loss 0.01007857 - time (sec): 1.46 - samples/sec: 3341.11 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:19:11,640 epoch 9 - iter 54/272 - loss 0.00854145 - time (sec): 2.98 - samples/sec: 3169.03 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:19:13,146 epoch 9 - iter 81/272 - loss 0.00716239 - time (sec): 4.49 - samples/sec: 3299.31 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:19:14,642 epoch 9 - iter 108/272 - loss 0.00776257 - time (sec): 5.99 - samples/sec: 3342.08 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:19:16,142 epoch 9 - iter 135/272 - loss 0.00999300 - time (sec): 7.49 - samples/sec: 3401.88 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:19:17,656 epoch 9 - iter 162/272 - loss 0.01006847 - time (sec): 9.00 - samples/sec: 3415.63 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:19:19,152 epoch 9 - iter 189/272 - loss 0.01126000 - time (sec): 10.50 - samples/sec: 3489.56 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:19:20,656 epoch 9 - iter 216/272 - loss 0.01056306 - time (sec): 12.00 - samples/sec: 3501.33 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:19:22,130 epoch 9 - iter 243/272 - loss 0.00967448 - time (sec): 13.47 - samples/sec: 3438.18 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:19:23,596 epoch 9 - iter 270/272 - loss 0.01005425 - time (sec): 14.94 - samples/sec: 3464.94 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:19:23,693 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:23,694 EPOCH 9 done: loss 0.0103 - lr: 0.000006
2023-10-25 21:19:24,857 DEV : loss 0.17289191484451294 - f1-score (micro avg) 0.8423
2023-10-25 21:19:24,864 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:26,356 epoch 10 - iter 27/272 - loss 0.01133191 - time (sec): 1.49 - samples/sec: 3075.68 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:19:28,102 epoch 10 - iter 54/272 - loss 0.01251702 - time (sec): 3.24 - samples/sec: 3020.24 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:19:29,595 epoch 10 - iter 81/272 - loss 0.00919290 - time (sec): 4.73 - samples/sec: 3165.49 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:19:31,113 epoch 10 - iter 108/272 - loss 0.00819876 - time (sec): 6.25 - samples/sec: 3284.98 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:19:32,594 epoch 10 - iter 135/272 - loss 0.00715582 - time (sec): 7.73 - samples/sec: 3260.31 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:19:34,046 epoch 10 - iter 162/272 - loss 0.00732270 - time (sec): 9.18 - samples/sec: 3335.48 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:19:35,625 epoch 10 - iter 189/272 - loss 0.00689220 - time (sec): 10.76 - samples/sec: 3376.85 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:19:37,126 epoch 10 - iter 216/272 - loss 0.00679103 - time (sec): 12.26 - samples/sec: 3394.36 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:19:38,674 epoch 10 - iter 243/272 - loss 0.00747472 - time (sec): 13.81 - samples/sec: 3363.94 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:19:40,224 epoch 10 - iter 270/272 - loss 0.00771969 - time (sec): 15.36 - samples/sec: 3367.15 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:19:40,334 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:40,334 EPOCH 10 done: loss 0.0077 - lr: 0.000000
2023-10-25 21:19:41,501 DEV : loss 0.17132313549518585 - f1-score (micro avg) 0.846
2023-10-25 21:19:41,994 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:41,995 Loading model from best epoch ...
2023-10-25 21:19:43,862 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
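The 17-tag dictionary above follows the BIOES scheme (Single, Begin, Inside, End, plus O) over the four entity types LOC, PER, ORG, and HumanProd. Decoding such a tag sequence into entity spans can be sketched as follows (plain Python; the helper name is illustrative, not part of Flair's API):

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end, label) spans, end-exclusive."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":                       # single-token entity
            spans.append((i, i + 1, label))
        elif prefix == "B":                     # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((start, i + 1, label))
            start = None
        # "I" simply continues the current entity
    return spans

print(bioes_to_spans(["S-LOC", "O", "B-PER", "I-PER", "E-PER"]))
# [(0, 1, 'LOC'), (2, 5, 'PER')]
```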
2023-10-25 21:19:46,005
Results:
- F-score (micro) 0.778
- F-score (macro) 0.6984
- Accuracy 0.6549
By class:
              precision    recall  f1-score   support

         LOC     0.8201    0.8910    0.8541       312
         PER     0.6755    0.8606    0.7569       208
         ORG     0.4426    0.4909    0.4655        55
   HumanProd     0.6129    0.8636    0.7170        22

   micro avg     0.7227    0.8425    0.7780       597
   macro avg     0.6378    0.7765    0.6984       597
weighted avg     0.7273    0.8425    0.7794       597
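The micro F-score reported as the final result is the harmonic mean of the micro-averaged precision and recall from the per-class table; a quick consistency check in plain Python:

```python
# Micro-averaged precision and recall, as printed in the table above.
precision, recall = 0.7227, 0.8425

# Micro F1 is their harmonic mean.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.778, the reported "F-score (micro)"
```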
2023-10-25 21:19:46,005 ----------------------------------------------------------------------------------------------------