2023-10-17 08:54:26,109 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,111 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:54:26,111 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,112 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 08:54:26,112 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,112 Train: 6183 sentences
2023-10-17 08:54:26,112 (train_with_dev=False, train_with_test=False)
2023-10-17 08:54:26,112 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,112 Training Params:
2023-10-17 08:54:26,112 - learning_rate: "3e-05"
2023-10-17 08:54:26,112 - mini_batch_size: "4"
2023-10-17 08:54:26,112 - max_epochs: "10"
2023-10-17 08:54:26,112 - shuffle: "True"
2023-10-17 08:54:26,112 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,112 Plugins:
2023-10-17 08:54:26,112 - TensorboardLogger
2023-10-17 08:54:26,112 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:54:26,113 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,113 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:54:26,113 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:54:26,113 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,113 Computation:
2023-10-17 08:54:26,113 - compute on device: cuda:0
2023-10-17 08:54:26,113 - embedding storage: none
2023-10-17 08:54:26,113 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,113 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 08:54:26,113 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,113 ----------------------------------------------------------------------------------------------------
2023-10-17 08:54:26,113 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 08:54:38,761 epoch 1 - iter 154/1546 - loss 2.03813618 - time (sec): 12.65 - samples/sec: 1016.55 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:54:50,424 epoch 1 - iter 308/1546 - loss 1.16576737 - time (sec): 24.31 - samples/sec: 1031.97 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:55:01,974 epoch 1 - iter 462/1546 - loss 0.82780908 - time (sec): 35.86 - samples/sec: 1043.52 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:55:13,456 epoch 1 - iter 616/1546 - loss 0.64844751 - time (sec): 47.34 - samples/sec: 1064.97 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:55:26,298 epoch 1 - iter 770/1546 - loss 0.54447510 - time (sec): 60.18 - samples/sec: 1040.86 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:55:38,367 epoch 1 - iter 924/1546 - loss 0.47295779 - time (sec): 72.25 - samples/sec: 1037.01 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:55:50,285 epoch 1 - iter 1078/1546 - loss 0.43087146 - time (sec): 84.17 - samples/sec: 1030.26 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:56:01,700 epoch 1 - iter 1232/1546 - loss 0.39626057 - time (sec): 95.59 - samples/sec: 1033.71 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:56:13,219 epoch 1 - iter 1386/1546 - loss 0.36195701 - time (sec): 107.10 - samples/sec: 1041.77 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:56:25,365 epoch 1 - iter 1540/1546 - loss 0.33627435 - time (sec): 119.25 - samples/sec: 1039.81 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:56:25,820 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:25,821 EPOCH 1 done: loss 0.3357 - lr: 0.000030
2023-10-17 08:56:28,051 DEV : loss 0.06119954213500023 - f1-score (micro avg) 0.7417
2023-10-17 08:56:28,079 saving best model
2023-10-17 08:56:28,614 ----------------------------------------------------------------------------------------------------
2023-10-17 08:56:40,142 epoch 2 - iter 154/1546 - loss 0.10453700 - time (sec): 11.53 - samples/sec: 1025.22 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:56:52,687 epoch 2 - iter 308/1546 - loss 0.08704820 - time (sec): 24.07 - samples/sec: 1003.63 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:57:05,545 epoch 2 - iter 462/1546 - loss 0.08423983 - time (sec): 36.93 - samples/sec: 1020.87 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:57:17,766 epoch 2 - iter 616/1546 - loss 0.08543159 - time (sec): 49.15 - samples/sec: 1019.96 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:57:30,051 epoch 2 - iter 770/1546 - loss 0.08700069 - time (sec): 61.43 - samples/sec: 1021.39 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:57:42,483 epoch 2 - iter 924/1546 - loss 0.08706791 - time (sec): 73.87 - samples/sec: 1014.33 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:57:55,337 epoch 2 - iter 1078/1546 - loss 0.08512417 - time (sec): 86.72 - samples/sec: 1008.52 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:58:07,949 epoch 2 - iter 1232/1546 - loss 0.08386350 - time (sec): 99.33 - samples/sec: 1012.09 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:58:20,787 epoch 2 - iter 1386/1546 - loss 0.08241631 - time (sec): 112.17 - samples/sec: 1000.28 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:58:32,838 epoch 2 - iter 1540/1546 - loss 0.08294873 - time (sec): 124.22 - samples/sec: 998.40 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:58:33,293 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:33,293 EPOCH 2 done: loss 0.0831 - lr: 0.000027
2023-10-17 08:58:36,749 DEV : loss 0.06554654985666275 - f1-score (micro avg) 0.7308
2023-10-17 08:58:36,779 ----------------------------------------------------------------------------------------------------
2023-10-17 08:58:48,693 epoch 3 - iter 154/1546 - loss 0.05336694 - time (sec): 11.91 - samples/sec: 981.36 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:59:01,019 epoch 3 - iter 308/1546 - loss 0.05246151 - time (sec): 24.24 - samples/sec: 1024.44 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:59:13,098 epoch 3 - iter 462/1546 - loss 0.04970324 - time (sec): 36.32 - samples/sec: 1051.46 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:59:25,184 epoch 3 - iter 616/1546 - loss 0.04604475 - time (sec): 48.40 - samples/sec: 1045.28 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:59:37,027 epoch 3 - iter 770/1546 - loss 0.04734557 - time (sec): 60.25 - samples/sec: 1036.50 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:59:48,854 epoch 3 - iter 924/1546 - loss 0.04896834 - time (sec): 72.07 - samples/sec: 1041.64 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:00:00,733 epoch 3 - iter 1078/1546 - loss 0.05098204 - time (sec): 83.95 - samples/sec: 1038.22 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:00:13,133 epoch 3 - iter 1232/1546 - loss 0.05011942 - time (sec): 96.35 - samples/sec: 1032.97 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:00:25,611 epoch 3 - iter 1386/1546 - loss 0.05242302 - time (sec): 108.83 - samples/sec: 1013.79 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:00:37,699 epoch 3 - iter 1540/1546 - loss 0.05287108 - time (sec): 120.92 - samples/sec: 1024.65 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:00:38,164 ----------------------------------------------------------------------------------------------------
2023-10-17 09:00:38,164 EPOCH 3 done: loss 0.0528 - lr: 0.000023
2023-10-17 09:00:41,078 DEV : loss 0.06000832840800285 - f1-score (micro avg) 0.8048
2023-10-17 09:00:41,106 saving best model
2023-10-17 09:00:42,500 ----------------------------------------------------------------------------------------------------
2023-10-17 09:00:54,331 epoch 4 - iter 154/1546 - loss 0.03629651 - time (sec): 11.83 - samples/sec: 1087.02 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:01:06,052 epoch 4 - iter 308/1546 - loss 0.03234710 - time (sec): 23.55 - samples/sec: 1041.51 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:01:18,027 epoch 4 - iter 462/1546 - loss 0.03335157 - time (sec): 35.52 - samples/sec: 1056.04 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:01:30,170 epoch 4 - iter 616/1546 - loss 0.03256761 - time (sec): 47.67 - samples/sec: 1049.02 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:01:42,048 epoch 4 - iter 770/1546 - loss 0.03247698 - time (sec): 59.54 - samples/sec: 1048.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:01:53,943 epoch 4 - iter 924/1546 - loss 0.03387399 - time (sec): 71.44 - samples/sec: 1054.17 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:02:05,990 epoch 4 - iter 1078/1546 - loss 0.03389742 - time (sec): 83.49 - samples/sec: 1054.23 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:02:17,845 epoch 4 - iter 1232/1546 - loss 0.03395574 - time (sec): 95.34 - samples/sec: 1045.74 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:02:29,692 epoch 4 - iter 1386/1546 - loss 0.03458275 - time (sec): 107.19 - samples/sec: 1040.29 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:02:41,652 epoch 4 - iter 1540/1546 - loss 0.03469566 - time (sec): 119.15 - samples/sec: 1040.31 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:02:42,104 ----------------------------------------------------------------------------------------------------
2023-10-17 09:02:42,104 EPOCH 4 done: loss 0.0350 - lr: 0.000020
2023-10-17 09:02:44,928 DEV : loss 0.08604831993579865 - f1-score (micro avg) 0.7776
2023-10-17 09:02:44,958 ----------------------------------------------------------------------------------------------------
2023-10-17 09:02:57,022 epoch 5 - iter 154/1546 - loss 0.02284007 - time (sec): 12.06 - samples/sec: 983.65 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:03:09,189 epoch 5 - iter 308/1546 - loss 0.01803104 - time (sec): 24.23 - samples/sec: 1002.50 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:03:21,262 epoch 5 - iter 462/1546 - loss 0.01912710 - time (sec): 36.30 - samples/sec: 996.31 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:03:33,353 epoch 5 - iter 616/1546 - loss 0.01989178 - time (sec): 48.39 - samples/sec: 1000.35 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:03:45,469 epoch 5 - iter 770/1546 - loss 0.02252057 - time (sec): 60.51 - samples/sec: 1014.50 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:03:57,436 epoch 5 - iter 924/1546 - loss 0.02304384 - time (sec): 72.48 - samples/sec: 1021.05 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:04:09,542 epoch 5 - iter 1078/1546 - loss 0.02187336 - time (sec): 84.58 - samples/sec: 1022.70 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:04:21,529 epoch 5 - iter 1232/1546 - loss 0.02255478 - time (sec): 96.57 - samples/sec: 1021.76 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:04:33,945 epoch 5 - iter 1386/1546 - loss 0.02290087 - time (sec): 108.98 - samples/sec: 1026.82 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:04:46,517 epoch 5 - iter 1540/1546 - loss 0.02398018 - time (sec): 121.56 - samples/sec: 1018.04 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:04:47,010 ----------------------------------------------------------------------------------------------------
2023-10-17 09:04:47,010 EPOCH 5 done: loss 0.0242 - lr: 0.000017
2023-10-17 09:04:50,104 DEV : loss 0.09960237890481949 - f1-score (micro avg) 0.7876
2023-10-17 09:04:50,137 ----------------------------------------------------------------------------------------------------
2023-10-17 09:05:02,656 epoch 6 - iter 154/1546 - loss 0.01677459 - time (sec): 12.52 - samples/sec: 1024.75 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:05:14,857 epoch 6 - iter 308/1546 - loss 0.01372105 - time (sec): 24.72 - samples/sec: 1041.31 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:05:27,217 epoch 6 - iter 462/1546 - loss 0.01421126 - time (sec): 37.08 - samples/sec: 1025.57 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:05:39,731 epoch 6 - iter 616/1546 - loss 0.01600038 - time (sec): 49.59 - samples/sec: 1019.56 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:05:52,141 epoch 6 - iter 770/1546 - loss 0.01705876 - time (sec): 62.00 - samples/sec: 1024.70 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:06:04,991 epoch 6 - iter 924/1546 - loss 0.01729932 - time (sec): 74.85 - samples/sec: 1003.82 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:06:17,860 epoch 6 - iter 1078/1546 - loss 0.01687148 - time (sec): 87.72 - samples/sec: 990.68 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:06:30,814 epoch 6 - iter 1232/1546 - loss 0.01640763 - time (sec): 100.67 - samples/sec: 980.55 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:06:44,004 epoch 6 - iter 1386/1546 - loss 0.01674242 - time (sec): 113.86 - samples/sec: 978.03 - lr: 0.000014 - momentum: 0.000000
2023-10-17 09:06:57,512 epoch 6 - iter 1540/1546 - loss 0.01702449 - time (sec): 127.37 - samples/sec: 972.60 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:06:58,036 ----------------------------------------------------------------------------------------------------
2023-10-17 09:06:58,037 EPOCH 6 done: loss 0.0170 - lr: 0.000013
2023-10-17 09:07:00,854 DEV : loss 0.09961654990911484 - f1-score (micro avg) 0.7976
2023-10-17 09:07:00,882 ----------------------------------------------------------------------------------------------------
2023-10-17 09:07:14,298 epoch 7 - iter 154/1546 - loss 0.00495503 - time (sec): 13.41 - samples/sec: 873.42 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:07:26,536 epoch 7 - iter 308/1546 - loss 0.01087828 - time (sec): 25.65 - samples/sec: 925.23 - lr: 0.000013 - momentum: 0.000000
2023-10-17 09:07:38,433 epoch 7 - iter 462/1546 - loss 0.01334889 - time (sec): 37.55 - samples/sec: 964.89 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:07:50,256 epoch 7 - iter 616/1546 - loss 0.01321695 - time (sec): 49.37 - samples/sec: 991.28 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:08:01,813 epoch 7 - iter 770/1546 - loss 0.01313843 - time (sec): 60.93 - samples/sec: 1010.71 - lr: 0.000012 - momentum: 0.000000
2023-10-17 09:08:13,352 epoch 7 - iter 924/1546 - loss 0.01154668 - time (sec): 72.47 - samples/sec: 1020.16 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:08:24,980 epoch 7 - iter 1078/1546 - loss 0.01103876 - time (sec): 84.10 - samples/sec: 1022.36 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:08:36,738 epoch 7 - iter 1232/1546 - loss 0.01096499 - time (sec): 95.85 - samples/sec: 1033.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 09:08:48,467 epoch 7 - iter 1386/1546 - loss 0.01135222 - time (sec): 107.58 - samples/sec: 1039.84 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:09:01,265 epoch 7 - iter 1540/1546 - loss 0.01229847 - time (sec): 120.38 - samples/sec: 1027.48 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:09:01,773 ----------------------------------------------------------------------------------------------------
2023-10-17 09:09:01,774 EPOCH 7 done: loss 0.0122 - lr: 0.000010
2023-10-17 09:09:04,924 DEV : loss 0.10686086863279343 - f1-score (micro avg) 0.8068
2023-10-17 09:09:04,958 saving best model
2023-10-17 09:09:06,388 ----------------------------------------------------------------------------------------------------
2023-10-17 09:09:18,894 epoch 8 - iter 154/1546 - loss 0.00730795 - time (sec): 12.50 - samples/sec: 990.35 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:09:31,198 epoch 8 - iter 308/1546 - loss 0.00668288 - time (sec): 24.80 - samples/sec: 1018.51 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:09:43,852 epoch 8 - iter 462/1546 - loss 0.00776588 - time (sec): 37.46 - samples/sec: 997.70 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:09:56,384 epoch 8 - iter 616/1546 - loss 0.00746374 - time (sec): 49.99 - samples/sec: 990.02 - lr: 0.000009 - momentum: 0.000000
2023-10-17 09:10:08,876 epoch 8 - iter 770/1546 - loss 0.00675501 - time (sec): 62.48 - samples/sec: 984.51 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:10:21,661 epoch 8 - iter 924/1546 - loss 0.00715765 - time (sec): 75.27 - samples/sec: 991.92 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:10:34,171 epoch 8 - iter 1078/1546 - loss 0.00694728 - time (sec): 87.78 - samples/sec: 998.22 - lr: 0.000008 - momentum: 0.000000
2023-10-17 09:10:46,699 epoch 8 - iter 1232/1546 - loss 0.00699978 - time (sec): 100.30 - samples/sec: 991.84 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:10:58,637 epoch 8 - iter 1386/1546 - loss 0.00720570 - time (sec): 112.24 - samples/sec: 988.67 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:11:10,512 epoch 8 - iter 1540/1546 - loss 0.00770745 - time (sec): 124.12 - samples/sec: 998.57 - lr: 0.000007 - momentum: 0.000000
2023-10-17 09:11:10,965 ----------------------------------------------------------------------------------------------------
2023-10-17 09:11:10,965 EPOCH 8 done: loss 0.0077 - lr: 0.000007
2023-10-17 09:11:13,892 DEV : loss 0.10704014450311661 - f1-score (micro avg) 0.8
2023-10-17 09:11:13,922 ----------------------------------------------------------------------------------------------------
2023-10-17 09:11:25,981 epoch 9 - iter 154/1546 - loss 0.00378941 - time (sec): 12.05 - samples/sec: 1045.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:11:37,802 epoch 9 - iter 308/1546 - loss 0.00283362 - time (sec): 23.88 - samples/sec: 1027.32 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:11:50,093 epoch 9 - iter 462/1546 - loss 0.00396993 - time (sec): 36.17 - samples/sec: 1034.15 - lr: 0.000006 - momentum: 0.000000
2023-10-17 09:12:03,236 epoch 9 - iter 616/1546 - loss 0.00400059 - time (sec): 49.31 - samples/sec: 996.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:12:15,466 epoch 9 - iter 770/1546 - loss 0.00385112 - time (sec): 61.54 - samples/sec: 1005.67 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:12:27,700 epoch 9 - iter 924/1546 - loss 0.00459493 - time (sec): 73.77 - samples/sec: 1003.17 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:12:39,832 epoch 9 - iter 1078/1546 - loss 0.00412718 - time (sec): 85.91 - samples/sec: 1011.44 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:12:51,879 epoch 9 - iter 1232/1546 - loss 0.00405342 - time (sec): 97.95 - samples/sec: 1010.84 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:13:03,846 epoch 9 - iter 1386/1546 - loss 0.00404944 - time (sec): 109.92 - samples/sec: 1021.84 - lr: 0.000004 - momentum: 0.000000
2023-10-17 09:13:15,705 epoch 9 - iter 1540/1546 - loss 0.00452713 - time (sec): 121.78 - samples/sec: 1016.74 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:13:16,160 ----------------------------------------------------------------------------------------------------
2023-10-17 09:13:16,160 EPOCH 9 done: loss 0.0045 - lr: 0.000003
2023-10-17 09:13:18,878 DEV : loss 0.12154770642518997 - f1-score (micro avg) 0.7983
2023-10-17 09:13:18,904 ----------------------------------------------------------------------------------------------------
2023-10-17 09:13:30,793 epoch 10 - iter 154/1546 - loss 0.00274083 - time (sec): 11.89 - samples/sec: 1053.40 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:13:42,788 epoch 10 - iter 308/1546 - loss 0.00390665 - time (sec): 23.88 - samples/sec: 1038.05 - lr: 0.000003 - momentum: 0.000000
2023-10-17 09:13:54,809 epoch 10 - iter 462/1546 - loss 0.00330672 - time (sec): 35.90 - samples/sec: 1054.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:14:07,461 epoch 10 - iter 616/1546 - loss 0.00342831 - time (sec): 48.56 - samples/sec: 1035.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:14:20,606 epoch 10 - iter 770/1546 - loss 0.00329853 - time (sec): 61.70 - samples/sec: 1013.00 - lr: 0.000002 - momentum: 0.000000
2023-10-17 09:14:33,362 epoch 10 - iter 924/1546 - loss 0.00299964 - time (sec): 74.46 - samples/sec: 996.99 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:14:46,754 epoch 10 - iter 1078/1546 - loss 0.00313900 - time (sec): 87.85 - samples/sec: 990.71 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:14:59,125 epoch 10 - iter 1232/1546 - loss 0.00324827 - time (sec): 100.22 - samples/sec: 986.19 - lr: 0.000001 - momentum: 0.000000
2023-10-17 09:15:12,274 epoch 10 - iter 1386/1546 - loss 0.00311186 - time (sec): 113.37 - samples/sec: 983.00 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:15:25,848 epoch 10 - iter 1540/1546 - loss 0.00341137 - time (sec): 126.94 - samples/sec: 975.57 - lr: 0.000000 - momentum: 0.000000
2023-10-17 09:15:26,351 ----------------------------------------------------------------------------------------------------
2023-10-17 09:15:26,351 EPOCH 10 done: loss 0.0034 - lr: 0.000000
2023-10-17 09:15:29,674 DEV : loss 0.11997128278017044 - f1-score (micro avg) 0.7886
2023-10-17 09:15:30,263 ----------------------------------------------------------------------------------------------------
2023-10-17 09:15:30,265 Loading model from best epoch ...
2023-10-17 09:15:32,732 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 09:15:40,874 
Results:
- F-score (micro) 0.8096
- F-score (macro) 0.7186
- Accuracy 0.6984

By class:
              precision    recall  f1-score   support

         LOC     0.8731    0.8362    0.8542       946
    BUILDING     0.6806    0.5297    0.5957       185
      STREET     0.6667    0.7500    0.7059        56

   micro avg     0.8365    0.7843    0.8096      1187
   macro avg     0.7401    0.7053    0.7186      1187
weighted avg     0.8333    0.7843    0.8069      1187

2023-10-17 09:15:40,875 ----------------------------------------------------------------------------------------------------
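The aggregate rows of the final "By class" table can be cross-checked from the per-class rows. The sketch below is a minimal illustration and is not part of the training run: the variable and function names are hypothetical, and the inputs are the four-decimal values printed in the log, so the recomputed scores agree with the logged ones only up to rounding.

```python
# Per-class test scores copied from the "By class" table:
# label -> (precision, recall, f1, support)
per_class = {
    "LOC": (0.8731, 0.8362, 0.8542, 946),
    "BUILDING": (0.6806, 0.5297, 0.5957, 185),
    "STREET": (0.6667, 0.7500, 0.7059, 56),
}


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Micro F1: harmonic mean of the logged micro-averaged precision and recall.
micro_f1 = f1_from(0.8365, 0.7843)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(scores[2] for scores in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by support (number of gold entities).
total_support = sum(scores[3] for scores in per_class.values())
weighted_f1 = sum(scores[2] * scores[3] for scores in per_class.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

The large gap between micro and macro F1 follows from the class imbalance: LOC accounts for 946 of the 1187 entities, so its higher score dominates the micro and weighted averages, while BUILDING and STREET pull the macro average down.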