2023-10-23 15:01:13,895 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:13,896 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-23 15:01:13,896 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:13,896 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-23 15:01:13,896 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:13,896 Train: 1100 sentences
2023-10-23 15:01:13,896 (train_with_dev=False, train_with_test=False)
2023-10-23 15:01:13,896 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:13,896 Training Params:
2023-10-23 15:01:13,896 - learning_rate: "3e-05"
2023-10-23 15:01:13,896 - mini_batch_size: "4"
2023-10-23 15:01:13,896 - max_epochs: "10"
2023-10-23 15:01:13,896 - shuffle: "True"
2023-10-23 15:01:13,896 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:13,896 Plugins:
2023-10-23 15:01:13,896 - TensorboardLogger
2023-10-23 15:01:13,896 - LinearScheduler | warmup_fraction: '0.1'
2023-10-23 15:01:13,896 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:13,896 Final evaluation on model from best epoch (best-model.pt)
2023-10-23 15:01:13,896 - metric: "('micro avg', 'f1-score')"
2023-10-23 15:01:13,896 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:13,896 Computation:
2023-10-23 15:01:13,897 - compute on device: cuda:0
2023-10-23 15:01:13,897 - embedding storage: none
2023-10-23 15:01:13,897 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:13,897 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-23 15:01:13,897 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:13,897 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:13,897 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-23 15:01:15,272 epoch 1 - iter 27/275 - loss 3.25393908 - time (sec): 1.37 - samples/sec: 1669.52 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:01:16,649 epoch 1 - iter 54/275 - loss 2.57464096 - time (sec): 2.75 - samples/sec: 1607.22 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:01:18,015 epoch 1 - iter 81/275 - loss 2.06080207 - time (sec): 4.12 - samples/sec: 1546.21 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:01:19,383 epoch 1 - iter 108/275 - loss 1.71478848 - time (sec): 5.48 - samples/sec: 1528.00 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:01:20,785 epoch 1 - iter 135/275 - loss 1.47161691 - time (sec): 6.89 - samples/sec: 1569.93 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:01:22,187 epoch 1 - iter 162/275 - loss 1.28629244 - time (sec): 8.29 - samples/sec: 1588.83 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:01:23,575 epoch 1 - iter 189/275 - loss 1.14942489 - time (sec): 9.68 - samples/sec: 1602.90 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:01:24,996 epoch 1 - iter 216/275 - loss 1.03789197 - time (sec): 11.10 - samples/sec: 1607.47 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:01:26,380 epoch 1 - iter 243/275 - loss 0.94916015 - time (sec): 12.48 - samples/sec: 1602.47 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:01:27,757 epoch 1 - iter 270/275 - loss 0.87843050 - time (sec): 13.86 - samples/sec: 1610.25 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:01:28,016 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:28,016 EPOCH 1 done: loss 0.8687 - lr: 0.000029
2023-10-23 15:01:28,433 DEV : loss 0.1798924207687378 - f1-score (micro avg) 0.7726
2023-10-23 15:01:28,438 saving best model
2023-10-23 15:01:28,837 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:30,227 epoch 2 - iter 27/275 - loss 0.13016473 - time (sec): 1.39 - samples/sec: 1562.72 - lr: 0.000030 - momentum: 0.000000
2023-10-23 15:01:31,622 epoch 2 - iter 54/275 - loss 0.15754978 - time (sec): 2.78 - samples/sec: 1547.47 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:01:33,017 epoch 2 - iter 81/275 - loss 0.17446397 - time (sec): 4.18 - samples/sec: 1542.82 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:01:34,403 epoch 2 - iter 108/275 - loss 0.18375447 - time (sec): 5.56 - samples/sec: 1577.39 - lr: 0.000029 - momentum: 0.000000
2023-10-23 15:01:35,796 epoch 2 - iter 135/275 - loss 0.18527314 - time (sec): 6.96 - samples/sec: 1565.96 - lr: 0.000028 - momentum: 0.000000
2023-10-23 15:01:37,217 epoch 2 - iter 162/275 - loss 0.17131923 - time (sec): 8.38 - samples/sec: 1552.03 - lr: 0.000028 - momentum: 0.000000
2023-10-23 15:01:38,613 epoch 2 - iter 189/275 - loss 0.16272828 - time (sec): 9.77 - samples/sec: 1548.36 - lr: 0.000028 - momentum: 0.000000
2023-10-23 15:01:40,029 epoch 2 - iter 216/275 - loss 0.16178130 - time (sec): 11.19 - samples/sec: 1585.99 - lr: 0.000027 - momentum: 0.000000
2023-10-23 15:01:41,427 epoch 2 - iter 243/275 - loss 0.15174993 - time (sec): 12.59 - samples/sec: 1580.54 - lr: 0.000027 - momentum: 0.000000
2023-10-23 15:01:42,820 epoch 2 - iter 270/275 - loss 0.15643905 - time (sec): 13.98 - samples/sec: 1597.35 - lr: 0.000027 - momentum: 0.000000
2023-10-23 15:01:43,079 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:43,080 EPOCH 2 done: loss 0.1571 - lr: 0.000027
2023-10-23 15:01:43,634 DEV : loss 0.12782859802246094 - f1-score (micro avg) 0.8339
2023-10-23 15:01:43,639 saving best model
2023-10-23 15:01:44,198 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:45,588 epoch 3 - iter 27/275 - loss 0.07572565 - time (sec): 1.39 - samples/sec: 1631.46 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:01:46,995 epoch 3 - iter 54/275 - loss 0.08926542 - time (sec): 2.79 - samples/sec: 1557.60 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:01:48,380 epoch 3 - iter 81/275 - loss 0.08635936 - time (sec): 4.18 - samples/sec: 1504.36 - lr: 0.000026 - momentum: 0.000000
2023-10-23 15:01:49,820 epoch 3 - iter 108/275 - loss 0.09552644 - time (sec): 5.62 - samples/sec: 1554.91 - lr: 0.000025 - momentum: 0.000000
2023-10-23 15:01:51,327 epoch 3 - iter 135/275 - loss 0.09493698 - time (sec): 7.12 - samples/sec: 1562.41 - lr: 0.000025 - momentum: 0.000000
2023-10-23 15:01:52,836 epoch 3 - iter 162/275 - loss 0.10197681 - time (sec): 8.63 - samples/sec: 1562.70 - lr: 0.000025 - momentum: 0.000000
2023-10-23 15:01:54,320 epoch 3 - iter 189/275 - loss 0.10219325 - time (sec): 10.12 - samples/sec: 1546.45 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:01:55,807 epoch 3 - iter 216/275 - loss 0.10502848 - time (sec): 11.60 - samples/sec: 1523.56 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:01:57,305 epoch 3 - iter 243/275 - loss 0.09950956 - time (sec): 13.10 - samples/sec: 1518.30 - lr: 0.000024 - momentum: 0.000000
2023-10-23 15:01:58,832 epoch 3 - iter 270/275 - loss 0.09846269 - time (sec): 14.63 - samples/sec: 1528.72 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:01:59,105 ----------------------------------------------------------------------------------------------------
2023-10-23 15:01:59,105 EPOCH 3 done: loss 0.0985 - lr: 0.000023
2023-10-23 15:01:59,645 DEV : loss 0.12585023045539856 - f1-score (micro avg) 0.8524
2023-10-23 15:01:59,652 saving best model
2023-10-23 15:02:00,197 ----------------------------------------------------------------------------------------------------
2023-10-23 15:02:01,708 epoch 4 - iter 27/275 - loss 0.07581123 - time (sec): 1.51 - samples/sec: 1483.21 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:02:03,216 epoch 4 - iter 54/275 - loss 0.06572517 - time (sec): 3.01 - samples/sec: 1458.76 - lr: 0.000023 - momentum: 0.000000
2023-10-23 15:02:04,687 epoch 4 - iter 81/275 - loss 0.07992442 - time (sec): 4.49 - samples/sec: 1548.84 - lr: 0.000022 - momentum: 0.000000
2023-10-23 15:02:06,103 epoch 4 - iter 108/275 - loss 0.07802940 - time (sec): 5.90 - samples/sec: 1560.87 - lr: 0.000022 - momentum: 0.000000
2023-10-23 15:02:07,653 epoch 4 - iter 135/275 - loss 0.08316745 - time (sec): 7.45 - samples/sec: 1514.72 - lr: 0.000022 - momentum: 0.000000
2023-10-23 15:02:09,083 epoch 4 - iter 162/275 - loss 0.07750285 - time (sec): 8.88 - samples/sec: 1539.09 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:02:10,505 epoch 4 - iter 189/275 - loss 0.07107305 - time (sec): 10.30 - samples/sec: 1524.37 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:02:11,936 epoch 4 - iter 216/275 - loss 0.06532731 - time (sec): 11.73 - samples/sec: 1537.98 - lr: 0.000021 - momentum: 0.000000
2023-10-23 15:02:13,349 epoch 4 - iter 243/275 - loss 0.06597475 - time (sec): 13.15 - samples/sec: 1533.71 - lr: 0.000020 - momentum: 0.000000
2023-10-23 15:02:14,760 epoch 4 - iter 270/275 - loss 0.06189829 - time (sec): 14.56 - samples/sec: 1539.77 - lr: 0.000020 - momentum: 0.000000
2023-10-23 15:02:15,016 ----------------------------------------------------------------------------------------------------
2023-10-23 15:02:15,016 EPOCH 4 done: loss 0.0609 - lr: 0.000020
2023-10-23 15:02:15,560 DEV : loss 0.1704767942428589 - f1-score (micro avg) 0.8595
2023-10-23 15:02:15,566 saving best model
2023-10-23 15:02:16,119 ----------------------------------------------------------------------------------------------------
2023-10-23 15:02:17,500 epoch 5 - iter 27/275 - loss 0.08075043 - time (sec): 1.38 - samples/sec: 1537.85 - lr: 0.000020 - momentum: 0.000000
2023-10-23 15:02:18,937 epoch 5 - iter 54/275 - loss 0.04771189 - time (sec): 2.81 - samples/sec: 1599.62 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:02:20,408 epoch 5 - iter 81/275 - loss 0.04589936 - time (sec): 4.28 - samples/sec: 1553.83 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:02:21,827 epoch 5 - iter 108/275 - loss 0.04625470 - time (sec): 5.70 - samples/sec: 1557.53 - lr: 0.000019 - momentum: 0.000000
2023-10-23 15:02:23,256 epoch 5 - iter 135/275 - loss 0.05076116 - time (sec): 7.13 - samples/sec: 1542.98 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:02:24,653 epoch 5 - iter 162/275 - loss 0.05107033 - time (sec): 8.53 - samples/sec: 1580.55 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:02:26,053 epoch 5 - iter 189/275 - loss 0.05185483 - time (sec): 9.93 - samples/sec: 1584.96 - lr: 0.000018 - momentum: 0.000000
2023-10-23 15:02:27,463 epoch 5 - iter 216/275 - loss 0.04858783 - time (sec): 11.34 - samples/sec: 1574.70 - lr: 0.000017 - momentum: 0.000000
2023-10-23 15:02:28,833 epoch 5 - iter 243/275 - loss 0.04990404 - time (sec): 12.71 - samples/sec: 1583.66 - lr: 0.000017 - momentum: 0.000000
2023-10-23 15:02:30,264 epoch 5 - iter 270/275 - loss 0.05114650 - time (sec): 14.14 - samples/sec: 1586.53 - lr: 0.000017 - momentum: 0.000000
2023-10-23 15:02:30,522 ----------------------------------------------------------------------------------------------------
2023-10-23 15:02:30,522 EPOCH 5 done: loss 0.0509 - lr: 0.000017
2023-10-23 15:02:31,058 DEV : loss 0.14342500269412994 - f1-score (micro avg) 0.8789
2023-10-23 15:02:31,064 saving best model
2023-10-23 15:02:31,594 ----------------------------------------------------------------------------------------------------
2023-10-23 15:02:33,008 epoch 6 - iter 27/275 - loss 0.05416127 - time (sec): 1.41 - samples/sec: 1747.19 - lr: 0.000016 - momentum: 0.000000
2023-10-23 15:02:34,424 epoch 6 - iter 54/275 - loss 0.03374837 - time (sec): 2.83 - samples/sec: 1743.42 - lr: 0.000016 - momentum: 0.000000
2023-10-23 15:02:35,844 epoch 6 - iter 81/275 - loss 0.04839805 - time (sec): 4.25 - samples/sec: 1687.75 - lr: 0.000016 - momentum: 0.000000
2023-10-23 15:02:37,247 epoch 6 - iter 108/275 - loss 0.04789575 - time (sec): 5.65 - samples/sec: 1596.63 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:02:38,672 epoch 6 - iter 135/275 - loss 0.04182854 - time (sec): 7.08 - samples/sec: 1586.27 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:02:40,121 epoch 6 - iter 162/275 - loss 0.04156850 - time (sec): 8.52 - samples/sec: 1599.60 - lr: 0.000015 - momentum: 0.000000
2023-10-23 15:02:41,575 epoch 6 - iter 189/275 - loss 0.03833196 - time (sec): 9.98 - samples/sec: 1604.29 - lr: 0.000014 - momentum: 0.000000
2023-10-23 15:02:42,992 epoch 6 - iter 216/275 - loss 0.03681823 - time (sec): 11.40 - samples/sec: 1591.12 - lr: 0.000014 - momentum: 0.000000
2023-10-23 15:02:44,434 epoch 6 - iter 243/275 - loss 0.03594981 - time (sec): 12.84 - samples/sec: 1586.74 - lr: 0.000014 - momentum: 0.000000
2023-10-23 15:02:45,847 epoch 6 - iter 270/275 - loss 0.03433456 - time (sec): 14.25 - samples/sec: 1574.51 - lr: 0.000013 - momentum: 0.000000
2023-10-23 15:02:46,104 ----------------------------------------------------------------------------------------------------
2023-10-23 15:02:46,104 EPOCH 6 done: loss 0.0346 - lr: 0.000013
2023-10-23 15:02:46,637 DEV : loss 0.13642658293247223 - f1-score (micro avg) 0.8814
2023-10-23 15:02:46,643 saving best model
2023-10-23 15:02:47,200 ----------------------------------------------------------------------------------------------------
2023-10-23 15:02:48,608 epoch 7 - iter 27/275 - loss 0.02650342 - time (sec): 1.40 - samples/sec: 1417.21 - lr: 0.000013 - momentum: 0.000000
2023-10-23 15:02:50,006 epoch 7 - iter 54/275 - loss 0.01385811 - time (sec): 2.80 - samples/sec: 1553.85 - lr: 0.000013 - momentum: 0.000000
2023-10-23 15:02:51,397 epoch 7 - iter 81/275 - loss 0.01617586 - time (sec): 4.19 - samples/sec: 1604.95 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:02:52,792 epoch 7 - iter 108/275 - loss 0.02170695 - time (sec): 5.59 - samples/sec: 1559.56 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:02:54,192 epoch 7 - iter 135/275 - loss 0.02142902 - time (sec): 6.99 - samples/sec: 1615.02 - lr: 0.000012 - momentum: 0.000000
2023-10-23 15:02:55,580 epoch 7 - iter 162/275 - loss 0.02338113 - time (sec): 8.38 - samples/sec: 1616.77 - lr: 0.000011 - momentum: 0.000000
2023-10-23 15:02:56,984 epoch 7 - iter 189/275 - loss 0.02495879 - time (sec): 9.78 - samples/sec: 1603.29 - lr: 0.000011 - momentum: 0.000000
2023-10-23 15:02:58,384 epoch 7 - iter 216/275 - loss 0.02648994 - time (sec): 11.18 - samples/sec: 1621.75 - lr: 0.000011 - momentum: 0.000000
2023-10-23 15:02:59,784 epoch 7 - iter 243/275 - loss 0.02714924 - time (sec): 12.58 - samples/sec: 1620.14 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:03:01,203 epoch 7 - iter 270/275 - loss 0.02572334 - time (sec): 14.00 - samples/sec: 1600.82 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:03:01,459 ----------------------------------------------------------------------------------------------------
2023-10-23 15:03:01,459 EPOCH 7 done: loss 0.0261 - lr: 0.000010
2023-10-23 15:03:02,002 DEV : loss 0.14091593027114868 - f1-score (micro avg) 0.8889
2023-10-23 15:03:02,008 saving best model
2023-10-23 15:03:02,559 ----------------------------------------------------------------------------------------------------
2023-10-23 15:03:03,964 epoch 8 - iter 27/275 - loss 0.00849142 - time (sec): 1.40 - samples/sec: 1639.27 - lr: 0.000010 - momentum: 0.000000
2023-10-23 15:03:05,368 epoch 8 - iter 54/275 - loss 0.01191158 - time (sec): 2.80 - samples/sec: 1653.99 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:03:06,755 epoch 8 - iter 81/275 - loss 0.01732017 - time (sec): 4.19 - samples/sec: 1601.75 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:03:08,151 epoch 8 - iter 108/275 - loss 0.01442721 - time (sec): 5.59 - samples/sec: 1617.09 - lr: 0.000009 - momentum: 0.000000
2023-10-23 15:03:09,545 epoch 8 - iter 135/275 - loss 0.01458848 - time (sec): 6.98 - samples/sec: 1643.43 - lr: 0.000008 - momentum: 0.000000
2023-10-23 15:03:10,940 epoch 8 - iter 162/275 - loss 0.01509681 - time (sec): 8.38 - samples/sec: 1649.02 - lr: 0.000008 - momentum: 0.000000
2023-10-23 15:03:12,328 epoch 8 - iter 189/275 - loss 0.01342641 - time (sec): 9.76 - samples/sec: 1631.85 - lr: 0.000008 - momentum: 0.000000
2023-10-23 15:03:13,716 epoch 8 - iter 216/275 - loss 0.01227183 - time (sec): 11.15 - samples/sec: 1622.71 - lr: 0.000007 - momentum: 0.000000
2023-10-23 15:03:15,113 epoch 8 - iter 243/275 - loss 0.01648427 - time (sec): 12.55 - samples/sec: 1612.23 - lr: 0.000007 - momentum: 0.000000
2023-10-23 15:03:16,499 epoch 8 - iter 270/275 - loss 0.01889030 - time (sec): 13.94 - samples/sec: 1599.89 - lr: 0.000007 - momentum: 0.000000
2023-10-23 15:03:16,766 ----------------------------------------------------------------------------------------------------
2023-10-23 15:03:16,766 EPOCH 8 done: loss 0.0199 - lr: 0.000007
2023-10-23 15:03:17,321 DEV : loss 0.14507058262825012 - f1-score (micro avg) 0.8988
2023-10-23 15:03:17,327 saving best model
2023-10-23 15:03:17,877 ----------------------------------------------------------------------------------------------------
2023-10-23 15:03:19,270 epoch 9 - iter 27/275 - loss 0.01029226 - time (sec): 1.39 - samples/sec: 1423.42 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:03:20,672 epoch 9 - iter 54/275 - loss 0.01788155 - time (sec): 2.79 - samples/sec: 1558.13 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:03:22,096 epoch 9 - iter 81/275 - loss 0.01234753 - time (sec): 4.21 - samples/sec: 1601.89 - lr: 0.000006 - momentum: 0.000000
2023-10-23 15:03:23,487 epoch 9 - iter 108/275 - loss 0.01125364 - time (sec): 5.60 - samples/sec: 1573.81 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:03:24,889 epoch 9 - iter 135/275 - loss 0.01157687 - time (sec): 7.01 - samples/sec: 1580.38 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:03:26,273 epoch 9 - iter 162/275 - loss 0.01077598 - time (sec): 8.39 - samples/sec: 1573.70 - lr: 0.000005 - momentum: 0.000000
2023-10-23 15:03:27,605 epoch 9 - iter 189/275 - loss 0.01072423 - time (sec): 9.72 - samples/sec: 1611.81 - lr: 0.000004 - momentum: 0.000000
2023-10-23 15:03:28,911 epoch 9 - iter 216/275 - loss 0.01415302 - time (sec): 11.03 - samples/sec: 1622.47 - lr: 0.000004 - momentum: 0.000000
2023-10-23 15:03:30,231 epoch 9 - iter 243/275 - loss 0.01322181 - time (sec): 12.35 - samples/sec: 1621.65 - lr: 0.000004 - momentum: 0.000000
2023-10-23 15:03:31,542 epoch 9 - iter 270/275 - loss 0.01419018 - time (sec): 13.66 - samples/sec: 1635.47 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:03:31,788 ----------------------------------------------------------------------------------------------------
2023-10-23 15:03:31,788 EPOCH 9 done: loss 0.0141 - lr: 0.000003
2023-10-23 15:03:32,327 DEV : loss 0.15191467106342316 - f1-score (micro avg) 0.89
2023-10-23 15:03:32,333 ----------------------------------------------------------------------------------------------------
2023-10-23 15:03:33,635 epoch 10 - iter 27/275 - loss 0.00481745 - time (sec): 1.30 - samples/sec: 1678.25 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:03:34,957 epoch 10 - iter 54/275 - loss 0.00245864 - time (sec): 2.62 - samples/sec: 1662.82 - lr: 0.000003 - momentum: 0.000000
2023-10-23 15:03:36,282 epoch 10 - iter 81/275 - loss 0.00464933 - time (sec): 3.95 - samples/sec: 1678.84 - lr: 0.000002 - momentum: 0.000000
2023-10-23 15:03:37,576 epoch 10 - iter 108/275 - loss 0.00510011 - time (sec): 5.24 - samples/sec: 1693.47 - lr: 0.000002 - momentum: 0.000000
2023-10-23 15:03:38,869 epoch 10 - iter 135/275 - loss 0.00565368 - time (sec): 6.53 - samples/sec: 1698.88 - lr: 0.000002 - momentum: 0.000000
2023-10-23 15:03:40,197 epoch 10 - iter 162/275 - loss 0.00574751 - time (sec): 7.86 - samples/sec: 1748.46 - lr: 0.000001 - momentum: 0.000000
2023-10-23 15:03:41,524 epoch 10 - iter 189/275 - loss 0.00588541 - time (sec): 9.19 - samples/sec: 1771.29 - lr: 0.000001 - momentum: 0.000000
2023-10-23 15:03:42,825 epoch 10 - iter 216/275 - loss 0.00763093 - time (sec): 10.49 - samples/sec: 1752.62 - lr: 0.000001 - momentum: 0.000000
2023-10-23 15:03:44,145 epoch 10 - iter 243/275 - loss 0.00980050 - time (sec): 11.81 - samples/sec: 1729.03 - lr: 0.000000 - momentum: 0.000000
2023-10-23 15:03:45,443 epoch 10 - iter 270/275 - loss 0.01148583 - time (sec): 13.11 - samples/sec: 1709.20 - lr: 0.000000 - momentum: 0.000000
2023-10-23 15:03:45,696 ----------------------------------------------------------------------------------------------------
2023-10-23 15:03:45,696 EPOCH 10 done: loss 0.0113 - lr: 0.000000
2023-10-23 15:03:46,229 DEV : loss 0.15140938758850098 - f1-score (micro avg) 0.8935
2023-10-23 15:03:46,636 ----------------------------------------------------------------------------------------------------
2023-10-23 15:03:46,637 Loading model from best epoch ...
2023-10-23 15:03:48,383 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-23 15:03:49,046 
Results:
- F-score (micro) 0.9227
- F-score (macro) 0.7845
- Accuracy 0.8691

By class:
              precision    recall  f1-score   support

       scope     0.9157    0.9261    0.9209       176
        pers     0.9839    0.9531    0.9683       128
        work     0.8553    0.8784    0.8667        74
      object     0.5000    0.5000    0.5000         2
         loc     1.0000    0.5000    0.6667         2

   micro avg     0.9239    0.9215    0.9227       382
   macro avg     0.8510    0.7515    0.7845       382
weighted avg     0.9251    0.9215    0.9227       382

2023-10-23 15:03:49,046 ----------------------------------------------------------------------------------------------------