2023-10-25 21:16:52,404 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 Train: 1085 sentences
2023-10-25 21:16:52,405 (train_with_dev=False, train_with_test=False)
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 Training Params:
2023-10-25 21:16:52,405 - learning_rate: "5e-05"
2023-10-25 21:16:52,405 - mini_batch_size: "4"
2023-10-25 21:16:52,405 - max_epochs: "10"
2023-10-25 21:16:52,405 - shuffle: "True"
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 Plugins:
2023-10-25 21:16:52,405 - TensorboardLogger
2023-10-25 21:16:52,405 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
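The `LinearScheduler` with `warmup_fraction: '0.1'` explains the lr column in the iteration lines below: the rate climbs from near zero to the peak 5e-05 over the first 10% of training (the whole first epoch here), then decays linearly to zero. A minimal stdlib sketch of that schedule, assuming 272 mini-batches per epoch for 10 epochs as in this log (the function name and per-step indexing are my illustration, not Flair's internals):

```python
def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (no momentum, as logged)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup phase: rate grows linearly with the step index.
        return peak_lr * (step + 1) / warmup_steps
    # Decay phase: linearly from peak_lr down to 0 over the remaining steps.
    remaining = total_steps - warmup_steps
    return peak_lr * max(0.0, (total_steps - step - 1) / remaining)

total = 272 * 10  # 272 mini-batches per epoch, 10 epochs (as in the log)
print(linear_schedule_lr(271, total))   # end of warmup: ≈5e-05, the peak
print(linear_schedule_lr(2719, total))  # last step: 0.0
```

This matches the logged trajectory: at iter 27 of epoch 1 the schedule gives about 0.000005, and the final epoch-10 lines bottom out at 0.000000.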
2023-10-25 21:16:52,405 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:16:52,405 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:16:52,405 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,405 Computation:
2023-10-25 21:16:52,406 - compute on device: cuda:0
2023-10-25 21:16:52,406 - embedding storage: none
2023-10-25 21:16:52,406 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,406 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-25 21:16:52,406 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,406 ----------------------------------------------------------------------------------------------------
2023-10-25 21:16:52,406 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:16:54,002 epoch 1 - iter 27/272 - loss 2.73511101 - time (sec): 1.60 - samples/sec: 3794.26 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:16:55,455 epoch 1 - iter 54/272 - loss 1.94519428 - time (sec): 3.05 - samples/sec: 3629.33 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:16:57,006 epoch 1 - iter 81/272 - loss 1.57460458 - time (sec): 4.60 - samples/sec: 3365.47 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:16:58,543 epoch 1 - iter 108/272 - loss 1.29419222 - time (sec): 6.14 - samples/sec: 3339.84 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:17:00,067 epoch 1 - iter 135/272 - loss 1.10694561 - time (sec): 7.66 - samples/sec: 3289.62 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:17:01,650 epoch 1 - iter 162/272 - loss 0.94631686 - time (sec): 9.24 - samples/sec: 3336.01 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:17:03,210 epoch 1 - iter 189/272 - loss 0.83019629 - time (sec): 10.80 - samples/sec: 3385.74 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:17:04,765 epoch 1 - iter 216/272 - loss 0.74771962 - time (sec): 12.36 - samples/sec: 3387.51 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:17:06,315 epoch 1 - iter 243/272 - loss 0.69042563 - time (sec): 13.91 - samples/sec: 3346.12 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:17:07,824 epoch 1 - iter 270/272 - loss 0.63255195 - time (sec): 15.42 - samples/sec: 3364.57 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:17:07,926 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:07,926 EPOCH 1 done: loss 0.6322 - lr: 0.000049
2023-10-25 21:17:08,632 DEV : loss 0.13009829819202423 - f1-score (micro avg) 0.7173
2023-10-25 21:17:08,638 saving best model
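The per-iteration lines above follow a fixed field layout (epoch, iter, loss, time, samples/sec, lr, momentum), which makes them easy to mine for plots or summaries. A small stdlib sketch that extracts the numeric fields from one such line; `parse_iteration` and the regex are my illustration, not part of Flair:

```python
import re

# Field layout of the per-iteration lines in this log (assumed fixed-format).
LINE_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+) - "
    r"loss (?P<loss>[\d.]+) - time \(sec\): (?P<time>[\d.]+) - "
    r"samples/sec: (?P<sps>[\d.]+) - lr: (?P<lr>[\d.]+)"
)

def parse_iteration(line):
    """Return a dict of the numeric fields from one progress line, or None."""
    m = LINE_RE.search(line)
    if not m:
        return None
    d = m.groupdict()
    return {"epoch": int(d["epoch"]), "iter": int(d["iter"]),
            "loss": float(d["loss"]), "lr": float(d["lr"])}

sample = ("2023-10-25 21:16:54,002 epoch 1 - iter 27/272 - loss 2.73511101 "
          "- time (sec): 1.60 - samples/sec: 3794.26 - lr: 0.000005 - momentum: 0.000000")
print(parse_iteration(sample))  # {'epoch': 1, 'iter': 27, 'loss': 2.73511101, 'lr': 5e-06}
```

Mapping the same pattern over every line of the log file yields the full loss and lr curves without needing the TensorBoard event files.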
2023-10-25 21:17:09,112 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:10,637 epoch 2 - iter 27/272 - loss 0.13866994 - time (sec): 1.52 - samples/sec: 3615.71 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:17:12,103 epoch 2 - iter 54/272 - loss 0.13389945 - time (sec): 2.99 - samples/sec: 3410.61 - lr: 0.000049 - momentum: 0.000000
2023-10-25 21:17:13,617 epoch 2 - iter 81/272 - loss 0.14062734 - time (sec): 4.50 - samples/sec: 3333.18 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:17:15,200 epoch 2 - iter 108/272 - loss 0.14321239 - time (sec): 6.09 - samples/sec: 3358.82 - lr: 0.000048 - momentum: 0.000000
2023-10-25 21:17:16,776 epoch 2 - iter 135/272 - loss 0.15486460 - time (sec): 7.66 - samples/sec: 3395.19 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:17:18,379 epoch 2 - iter 162/272 - loss 0.14631541 - time (sec): 9.26 - samples/sec: 3379.73 - lr: 0.000047 - momentum: 0.000000
2023-10-25 21:17:19,943 epoch 2 - iter 189/272 - loss 0.13556396 - time (sec): 10.83 - samples/sec: 3419.67 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:17:21,475 epoch 2 - iter 216/272 - loss 0.13296855 - time (sec): 12.36 - samples/sec: 3426.75 - lr: 0.000046 - momentum: 0.000000
2023-10-25 21:17:22,992 epoch 2 - iter 243/272 - loss 0.13035999 - time (sec): 13.88 - samples/sec: 3405.22 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:17:24,519 epoch 2 - iter 270/272 - loss 0.12823982 - time (sec): 15.40 - samples/sec: 3344.32 - lr: 0.000045 - momentum: 0.000000
2023-10-25 21:17:24,625 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:24,625 EPOCH 2 done: loss 0.1271 - lr: 0.000045
2023-10-25 21:17:26,305 DEV : loss 0.12012875825166702 - f1-score (micro avg) 0.7755
2023-10-25 21:17:26,312 saving best model
2023-10-25 21:17:26,974 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:28,496 epoch 3 - iter 27/272 - loss 0.05489200 - time (sec): 1.52 - samples/sec: 3451.25 - lr: 0.000044 - momentum: 0.000000
2023-10-25 21:17:30,027 epoch 3 - iter 54/272 - loss 0.05237056 - time (sec): 3.05 - samples/sec: 3163.33 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:17:31,543 epoch 3 - iter 81/272 - loss 0.04896805 - time (sec): 4.57 - samples/sec: 3251.03 - lr: 0.000043 - momentum: 0.000000
2023-10-25 21:17:33,047 epoch 3 - iter 108/272 - loss 0.05720601 - time (sec): 6.07 - samples/sec: 3167.80 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:17:34,582 epoch 3 - iter 135/272 - loss 0.06017766 - time (sec): 7.61 - samples/sec: 3239.90 - lr: 0.000042 - momentum: 0.000000
2023-10-25 21:17:36,066 epoch 3 - iter 162/272 - loss 0.06582158 - time (sec): 9.09 - samples/sec: 3282.43 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:17:37,640 epoch 3 - iter 189/272 - loss 0.06333105 - time (sec): 10.66 - samples/sec: 3318.01 - lr: 0.000041 - momentum: 0.000000
2023-10-25 21:17:39,236 epoch 3 - iter 216/272 - loss 0.06358727 - time (sec): 12.26 - samples/sec: 3353.15 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:17:40,822 epoch 3 - iter 243/272 - loss 0.06743678 - time (sec): 13.85 - samples/sec: 3390.62 - lr: 0.000040 - momentum: 0.000000
2023-10-25 21:17:42,290 epoch 3 - iter 270/272 - loss 0.06746195 - time (sec): 15.31 - samples/sec: 3375.19 - lr: 0.000039 - momentum: 0.000000
2023-10-25 21:17:42,389 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:42,389 EPOCH 3 done: loss 0.0678 - lr: 0.000039
2023-10-25 21:17:43,576 DEV : loss 0.13368070125579834 - f1-score (micro avg) 0.7856
2023-10-25 21:17:43,583 saving best model
2023-10-25 21:17:44,277 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:45,762 epoch 4 - iter 27/272 - loss 0.02597351 - time (sec): 1.48 - samples/sec: 3438.51 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:17:47,241 epoch 4 - iter 54/272 - loss 0.03908671 - time (sec): 2.96 - samples/sec: 3412.08 - lr: 0.000038 - momentum: 0.000000
2023-10-25 21:17:48,764 epoch 4 - iter 81/272 - loss 0.04178419 - time (sec): 4.49 - samples/sec: 3459.41 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:17:50,232 epoch 4 - iter 108/272 - loss 0.04031066 - time (sec): 5.95 - samples/sec: 3453.62 - lr: 0.000037 - momentum: 0.000000
2023-10-25 21:17:51,669 epoch 4 - iter 135/272 - loss 0.04375782 - time (sec): 7.39 - samples/sec: 3397.84 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:17:53,165 epoch 4 - iter 162/272 - loss 0.04117904 - time (sec): 8.89 - samples/sec: 3448.55 - lr: 0.000036 - momentum: 0.000000
2023-10-25 21:17:54,651 epoch 4 - iter 189/272 - loss 0.04053027 - time (sec): 10.37 - samples/sec: 3436.56 - lr: 0.000035 - momentum: 0.000000
2023-10-25 21:17:56,177 epoch 4 - iter 216/272 - loss 0.04161959 - time (sec): 11.90 - samples/sec: 3484.43 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:17:57,662 epoch 4 - iter 243/272 - loss 0.04033603 - time (sec): 13.38 - samples/sec: 3500.81 - lr: 0.000034 - momentum: 0.000000
2023-10-25 21:17:59,154 epoch 4 - iter 270/272 - loss 0.04150719 - time (sec): 14.88 - samples/sec: 3481.84 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:17:59,263 ----------------------------------------------------------------------------------------------------
2023-10-25 21:17:59,264 EPOCH 4 done: loss 0.0414 - lr: 0.000033
2023-10-25 21:18:00,579 DEV : loss 0.16597497463226318 - f1-score (micro avg) 0.7477
2023-10-25 21:18:00,586 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:02,148 epoch 5 - iter 27/272 - loss 0.02656488 - time (sec): 1.56 - samples/sec: 3420.72 - lr: 0.000033 - momentum: 0.000000
2023-10-25 21:18:03,672 epoch 5 - iter 54/272 - loss 0.03046942 - time (sec): 3.08 - samples/sec: 3250.62 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:18:05,134 epoch 5 - iter 81/272 - loss 0.02530309 - time (sec): 4.55 - samples/sec: 3316.03 - lr: 0.000032 - momentum: 0.000000
2023-10-25 21:18:06,608 epoch 5 - iter 108/272 - loss 0.03298758 - time (sec): 6.02 - samples/sec: 3322.10 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:18:08,118 epoch 5 - iter 135/272 - loss 0.02852542 - time (sec): 7.53 - samples/sec: 3390.72 - lr: 0.000031 - momentum: 0.000000
2023-10-25 21:18:09,598 epoch 5 - iter 162/272 - loss 0.02918451 - time (sec): 9.01 - samples/sec: 3387.66 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:18:11,085 epoch 5 - iter 189/272 - loss 0.02740291 - time (sec): 10.50 - samples/sec: 3304.99 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:18:12,583 epoch 5 - iter 216/272 - loss 0.02813313 - time (sec): 12.00 - samples/sec: 3345.46 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:18:14,147 epoch 5 - iter 243/272 - loss 0.03233798 - time (sec): 13.56 - samples/sec: 3404.86 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:18:15,739 epoch 5 - iter 270/272 - loss 0.03072248 - time (sec): 15.15 - samples/sec: 3414.00 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:18:15,853 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:15,854 EPOCH 5 done: loss 0.0306 - lr: 0.000028
2023-10-25 21:18:17,134 DEV : loss 0.14486266672611237 - f1-score (micro avg) 0.8346
2023-10-25 21:18:17,141 saving best model
2023-10-25 21:18:17,813 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:19,374 epoch 6 - iter 27/272 - loss 0.01515445 - time (sec): 1.56 - samples/sec: 3975.21 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:18:20,932 epoch 6 - iter 54/272 - loss 0.01925765 - time (sec): 3.12 - samples/sec: 3807.21 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:18:22,422 epoch 6 - iter 81/272 - loss 0.01952897 - time (sec): 4.61 - samples/sec: 3499.39 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:18:23,949 epoch 6 - iter 108/272 - loss 0.02282516 - time (sec): 6.13 - samples/sec: 3440.84 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:18:25,451 epoch 6 - iter 135/272 - loss 0.02150874 - time (sec): 7.63 - samples/sec: 3381.01 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:18:26,872 epoch 6 - iter 162/272 - loss 0.02859869 - time (sec): 9.06 - samples/sec: 3367.40 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:18:28,297 epoch 6 - iter 189/272 - loss 0.02820747 - time (sec): 10.48 - samples/sec: 3423.61 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:18:29,818 epoch 6 - iter 216/272 - loss 0.02595280 - time (sec): 12.00 - samples/sec: 3428.18 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:18:31,666 epoch 6 - iter 243/272 - loss 0.02675736 - time (sec): 13.85 - samples/sec: 3340.76 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:18:33,136 epoch 6 - iter 270/272 - loss 0.02531645 - time (sec): 15.32 - samples/sec: 3372.83 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:18:33,237 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:33,238 EPOCH 6 done: loss 0.0252 - lr: 0.000022
2023-10-25 21:18:34,449 DEV : loss 0.1511414498090744 - f1-score (micro avg) 0.8556
2023-10-25 21:18:34,457 saving best model
2023-10-25 21:18:35,122 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:36,598 epoch 7 - iter 27/272 - loss 0.00848877 - time (sec): 1.47 - samples/sec: 3491.50 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:18:38,116 epoch 7 - iter 54/272 - loss 0.01251791 - time (sec): 2.99 - samples/sec: 3486.97 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:18:39,650 epoch 7 - iter 81/272 - loss 0.02004690 - time (sec): 4.53 - samples/sec: 3391.27 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:18:41,138 epoch 7 - iter 108/272 - loss 0.02098249 - time (sec): 6.01 - samples/sec: 3482.94 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:18:42,649 epoch 7 - iter 135/272 - loss 0.02170136 - time (sec): 7.52 - samples/sec: 3444.78 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:18:44,177 epoch 7 - iter 162/272 - loss 0.02356662 - time (sec): 9.05 - samples/sec: 3526.03 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:18:45,659 epoch 7 - iter 189/272 - loss 0.02264799 - time (sec): 10.53 - samples/sec: 3481.29 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:18:47,128 epoch 7 - iter 216/272 - loss 0.02390930 - time (sec): 12.00 - samples/sec: 3497.71 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:18:48,634 epoch 7 - iter 243/272 - loss 0.02387702 - time (sec): 13.51 - samples/sec: 3464.23 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:18:50,197 epoch 7 - iter 270/272 - loss 0.02410360 - time (sec): 15.07 - samples/sec: 3423.05 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:18:50,306 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:50,306 EPOCH 7 done: loss 0.0239 - lr: 0.000017
2023-10-25 21:18:51,555 DEV : loss 0.15409183502197266 - f1-score (micro avg) 0.8355
2023-10-25 21:18:51,562 ----------------------------------------------------------------------------------------------------
2023-10-25 21:18:53,042 epoch 8 - iter 27/272 - loss 0.01560783 - time (sec): 1.48 - samples/sec: 3183.20 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:18:54,650 epoch 8 - iter 54/272 - loss 0.01094755 - time (sec): 3.09 - samples/sec: 3313.48 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:18:56,246 epoch 8 - iter 81/272 - loss 0.01203350 - time (sec): 4.68 - samples/sec: 3277.36 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:18:57,779 epoch 8 - iter 108/272 - loss 0.01261165 - time (sec): 6.22 - samples/sec: 3268.70 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:18:59,399 epoch 8 - iter 135/272 - loss 0.01335699 - time (sec): 7.84 - samples/sec: 3254.45 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:19:01,066 epoch 8 - iter 162/272 - loss 0.01326646 - time (sec): 9.50 - samples/sec: 3375.65 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:19:02,702 epoch 8 - iter 189/272 - loss 0.01258962 - time (sec): 11.14 - samples/sec: 3336.91 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:19:04,361 epoch 8 - iter 216/272 - loss 0.01307808 - time (sec): 12.80 - samples/sec: 3372.55 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:19:05,872 epoch 8 - iter 243/272 - loss 0.01357464 - time (sec): 14.31 - samples/sec: 3295.47 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:19:07,305 epoch 8 - iter 270/272 - loss 0.01324032 - time (sec): 15.74 - samples/sec: 3287.21 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:19:07,410 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:07,410 EPOCH 8 done: loss 0.0132 - lr: 0.000011
2023-10-25 21:19:08,648 DEV : loss 0.168454110622406 - f1-score (micro avg) 0.8476
2023-10-25 21:19:08,655 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:10,113 epoch 9 - iter 27/272 - loss 0.01007857 - time (sec): 1.46 - samples/sec: 3341.11 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:19:11,640 epoch 9 - iter 54/272 - loss 0.00854145 - time (sec): 2.98 - samples/sec: 3169.03 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:19:13,146 epoch 9 - iter 81/272 - loss 0.00716239 - time (sec): 4.49 - samples/sec: 3299.31 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:19:14,642 epoch 9 - iter 108/272 - loss 0.00776257 - time (sec): 5.99 - samples/sec: 3342.08 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:19:16,142 epoch 9 - iter 135/272 - loss 0.00999300 - time (sec): 7.49 - samples/sec: 3401.88 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:19:17,656 epoch 9 - iter 162/272 - loss 0.01006847 - time (sec): 9.00 - samples/sec: 3415.63 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:19:19,152 epoch 9 - iter 189/272 - loss 0.01126000 - time (sec): 10.50 - samples/sec: 3489.56 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:19:20,656 epoch 9 - iter 216/272 - loss 0.01056306 - time (sec): 12.00 - samples/sec: 3501.33 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:19:22,130 epoch 9 - iter 243/272 - loss 0.00967448 - time (sec): 13.47 - samples/sec: 3438.18 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:19:23,596 epoch 9 - iter 270/272 - loss 0.01005425 - time (sec): 14.94 - samples/sec: 3464.94 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:19:23,693 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:23,694 EPOCH 9 done: loss 0.0103 - lr: 0.000006
2023-10-25 21:19:24,857 DEV : loss 0.17289191484451294 - f1-score (micro avg) 0.8423
2023-10-25 21:19:24,864 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:26,356 epoch 10 - iter 27/272 - loss 0.01133191 - time (sec): 1.49 - samples/sec: 3075.68 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:19:28,102 epoch 10 - iter 54/272 - loss 0.01251702 - time (sec): 3.24 - samples/sec: 3020.24 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:19:29,595 epoch 10 - iter 81/272 - loss 0.00919290 - time (sec): 4.73 - samples/sec: 3165.49 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:19:31,113 epoch 10 - iter 108/272 - loss 0.00819876 - time (sec): 6.25 - samples/sec: 3284.98 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:19:32,594 epoch 10 - iter 135/272 - loss 0.00715582 - time (sec): 7.73 - samples/sec: 3260.31 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:19:34,046 epoch 10 - iter 162/272 - loss 0.00732270 - time (sec): 9.18 - samples/sec: 3335.48 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:19:35,625 epoch 10 - iter 189/272 - loss 0.00689220 - time (sec): 10.76 - samples/sec: 3376.85 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:19:37,126 epoch 10 - iter 216/272 - loss 0.00679103 - time (sec): 12.26 - samples/sec: 3394.36 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:19:38,674 epoch 10 - iter 243/272 - loss 0.00747472 - time (sec): 13.81 - samples/sec: 3363.94 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:19:40,224 epoch 10 - iter 270/272 - loss 0.00771969 - time (sec): 15.36 - samples/sec: 3367.15 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:19:40,334 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:40,334 EPOCH 10 done: loss 0.0077 - lr: 0.000000
2023-10-25 21:19:41,501 DEV : loss 0.17132313549518585 - f1-score (micro avg) 0.846
2023-10-25 21:19:41,994 ----------------------------------------------------------------------------------------------------
2023-10-25 21:19:41,995 Loading model from best epoch ...
2023-10-25 21:19:43,862 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
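The 17-tag dictionary above is a BIOES scheme: `O` plus Single/Begin/End/Inside tags for each of the four entity types (LOC, PER, HumanProd, ORG), which is where 4 × 4 + 1 = 17 output classes of the final linear layer come from. A minimal sketch of how such a tag sequence decodes into entity spans; the decoder below is my own illustration, not Flair's implementation:

```python
def bioes_to_spans(tags):
    """Decode a BIOES tag sequence into (start, end, label) spans, end inclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None  # an O tag discards any unfinished entity
            continue
        prefix, lab = tag.split("-", 1)
        if prefix == "S":          # single-token entity
            spans.append((i, i, lab))
            start = None
        elif prefix == "B":        # entity begins
            start, label = i, lab
        elif prefix == "E" and start is not None and lab == label:
            spans.append((start, i, lab))  # entity ends; emit the span
            start = None
        # I- tags just continue the current entity; malformed runs are skipped.
    return spans

tags = ["O", "S-LOC", "B-PER", "I-PER", "E-PER", "O", "S-HumanProd"]
print(bioes_to_spans(tags))  # [(1, 1, 'LOC'), (2, 4, 'PER'), (6, 6, 'HumanProd')]
```

The `support` column in the report below counts these decoded spans (entities), not individual tokens.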
2023-10-25 21:19:46,005
Results:
- F-score (micro) 0.778
- F-score (macro) 0.6984
- Accuracy 0.6549

By class:
              precision    recall  f1-score   support

         LOC     0.8201    0.8910    0.8541       312
         PER     0.6755    0.8606    0.7569       208
         ORG     0.4426    0.4909    0.4655        55
   HumanProd     0.6129    0.8636    0.7170        22

   micro avg     0.7227    0.8425    0.7780       597
   macro avg     0.6378    0.7765    0.6984       597
weighted avg     0.7273    0.8425    0.7794       597
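The headline numbers in the report are consistent with its own rows: the micro F-score 0.778 is the harmonic mean of the micro-avg precision (0.7227) and recall (0.8425), and the macro F-score 0.6984 is the unweighted mean of the four per-class f1-scores. A quick arithmetic check:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Micro avg row from the report: precision 0.7227, recall 0.8425 -> ~0.778
print(round(f1(0.7227, 0.8425), 4))

# Macro avg f1 is the unweighted mean of the per-class f1-scores -> ~0.6984
per_class_f1 = [0.8541, 0.7569, 0.4655, 0.7170]  # LOC, PER, ORG, HumanProd
print(round(sum(per_class_f1) / len(per_class_f1), 4))
```

Note the gap between the dev score of the selected model (0.8556 at epoch 6) and this final 0.778: the report is computed on the held-out 364-sentence test split, not the dev set.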
2023-10-25 21:19:46,005 ----------------------------------------------------------------------------------------------------