2023-10-17 00:02:15,413 ----------------------------------------------------------------------------------------------------
2023-10-17 00:02:15,414 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 00:02:15,414 ----------------------------------------------------------------------------------------------------
2023-10-17 00:02:15,414 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 00:02:15,414 ----------------------------------------------------------------------------------------------------
2023-10-17 00:02:15,414 Train: 6183 sentences
2023-10-17 00:02:15,414 (train_with_dev=False, train_with_test=False)
2023-10-17 00:02:15,414 ----------------------------------------------------------------------------------------------------
2023-10-17 00:02:15,415 Training Params:
2023-10-17 00:02:15,415 - learning_rate: "3e-05"
2023-10-17 00:02:15,415 - mini_batch_size: "8"
2023-10-17 00:02:15,415 - max_epochs: "10"
2023-10-17 00:02:15,415 - shuffle: "True"
2023-10-17 00:02:15,415 ----------------------------------------------------------------------------------------------------
2023-10-17 00:02:15,415 Plugins:
2023-10-17 00:02:15,415 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 00:02:15,415 ----------------------------------------------------------------------------------------------------
2023-10-17 00:02:15,415 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 00:02:15,415 - metric: "('micro avg', 'f1-score')"
2023-10-17 00:02:15,415 ----------------------------------------------------------------------------------------------------
2023-10-17 00:02:15,415 Computation:
2023-10-17 00:02:15,415 - compute on device: cuda:0
2023-10-17 00:02:15,415 - embedding storage: none
2023-10-17 00:02:15,415 ----------------------------------------------------------------------------------------------------
2023-10-17 00:02:15,415 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 00:02:15,415 ----------------------------------------------------------------------------------------------------
2023-10-17 00:02:15,415 ----------------------------------------------------------------------------------------------------
2023-10-17 00:02:19,821 epoch 1 - iter 77/773 - loss 2.32537798 - time (sec): 4.40 - samples/sec: 2923.31 - lr: 0.000003 - momentum: 0.000000
2023-10-17 00:02:24,181 epoch 1 - iter 154/773 - loss 1.39644595 - time (sec): 8.76 - samples/sec: 2910.79 - lr: 0.000006 - momentum: 0.000000
2023-10-17 00:02:28,671 epoch 1 - iter 231/773 - loss 1.01126391 - time (sec): 13.25 - samples/sec: 2847.50 - lr: 0.000009 - momentum: 0.000000
2023-10-17 00:02:33,108 epoch 1 - iter 308/773 - loss 0.80543336 - time (sec): 17.69 - samples/sec: 2821.83 - lr: 0.000012 - momentum: 0.000000
2023-10-17 00:02:37,725 epoch 1 - iter 385/773 - loss 0.66957136 - time (sec): 22.31 - samples/sec: 2803.17 - lr: 0.000015 - momentum: 0.000000
2023-10-17 00:02:42,147 epoch 1 - iter 462/773 - loss 0.58326545 - time (sec): 26.73 - samples/sec: 2778.15 - lr: 0.000018 - momentum: 0.000000
2023-10-17 00:02:46,740 epoch 1 - iter 539/773 - loss 0.51629996 - time (sec): 31.32 - samples/sec: 2754.64 - lr: 0.000021 - momentum: 0.000000
2023-10-17 00:02:51,150 epoch 1 - iter 616/773 - loss 0.46530910 - time (sec): 35.73 - samples/sec: 2756.33 - lr: 0.000024 - momentum: 0.000000
2023-10-17 00:02:55,889 epoch 1 - iter 693/773 - loss 0.42285036 - time (sec): 40.47 - samples/sec: 2748.13 - lr: 0.000027 - momentum: 0.000000
2023-10-17 00:03:00,302 epoch 1 - iter 770/773 - loss 0.38822438 - time (sec): 44.89 - samples/sec: 2760.80 - lr: 0.000030 - momentum: 0.000000
2023-10-17 00:03:00,451 ----------------------------------------------------------------------------------------------------
2023-10-17 00:03:00,451 EPOCH 1 done: loss 0.3872 - lr: 0.000030
2023-10-17 00:03:02,192 DEV : loss 0.05650660768151283 - f1-score (micro avg) 0.6938
2023-10-17 00:03:02,205 saving best model
2023-10-17 00:03:02,539 ----------------------------------------------------------------------------------------------------
2023-10-17 00:03:07,258 epoch 2 - iter 77/773 - loss 0.08489853 - time (sec): 4.72 - samples/sec: 2802.48 - lr: 0.000030 - momentum: 0.000000
2023-10-17 00:03:11,873 epoch 2 - iter 154/773 - loss 0.08347053 - time (sec): 9.33 - samples/sec: 2766.52 - lr: 0.000029 - momentum: 0.000000
2023-10-17 00:03:16,356 epoch 2 - iter 231/773 - loss 0.08160898 - time (sec): 13.82 - samples/sec: 2757.91 - lr: 0.000029 - momentum: 0.000000
2023-10-17 00:03:20,717 epoch 2 - iter 308/773 - loss 0.08349640 - time (sec): 18.18 - samples/sec: 2752.10 - lr: 0.000029 - momentum: 0.000000
2023-10-17 00:03:25,154 epoch 2 - iter 385/773 - loss 0.08202306 - time (sec): 22.61 - samples/sec: 2720.25 - lr: 0.000028 - momentum: 0.000000
2023-10-17 00:03:29,566 epoch 2 - iter 462/773 - loss 0.08138239 - time (sec): 27.03 - samples/sec: 2739.59 - lr: 0.000028 - momentum: 0.000000
2023-10-17 00:03:34,029 epoch 2 - iter 539/773 - loss 0.08140479 - time (sec): 31.49 - samples/sec: 2754.29 - lr: 0.000028 - momentum: 0.000000
2023-10-17 00:03:38,872 epoch 2 - iter 616/773 - loss 0.07747793 - time (sec): 36.33 - samples/sec: 2743.79 - lr: 0.000027 - momentum: 0.000000
2023-10-17 00:03:43,274 epoch 2 - iter 693/773 - loss 0.07718394 - time (sec): 40.73 - samples/sec: 2736.23 - lr: 0.000027 - momentum: 0.000000
2023-10-17 00:03:47,771 epoch 2 - iter 770/773 - loss 0.07694589 - time (sec): 45.23 - samples/sec: 2741.04 - lr: 0.000027 - momentum: 0.000000
2023-10-17 00:03:47,914 ----------------------------------------------------------------------------------------------------
2023-10-17 00:03:47,915 EPOCH 2 done: loss 0.0769 - lr: 0.000027
2023-10-17 00:03:50,333 DEV : loss 0.0547206737101078 - f1-score (micro avg) 0.7759
2023-10-17 00:03:50,346 saving best model
2023-10-17 00:03:50,814 ----------------------------------------------------------------------------------------------------
2023-10-17 00:03:55,393 epoch 3 - iter 77/773 - loss 0.04187396 - time (sec): 4.58 - samples/sec: 2791.16 - lr: 0.000026 - momentum: 0.000000
2023-10-17 00:04:00,079 epoch 3 - iter 154/773 - loss 0.05682215 - time (sec): 9.26 - samples/sec: 2789.17 - lr: 0.000026 - momentum: 0.000000
2023-10-17 00:04:04,789 epoch 3 - iter 231/773 - loss 0.05388452 - time (sec): 13.97 - samples/sec: 2803.36 - lr: 0.000026 - momentum: 0.000000
2023-10-17 00:04:09,173 epoch 3 - iter 308/773 - loss 0.05046208 - time (sec): 18.36 - samples/sec: 2767.82 - lr: 0.000025 - momentum: 0.000000
2023-10-17 00:04:13,610 epoch 3 - iter 385/773 - loss 0.04968648 - time (sec): 22.79 - samples/sec: 2757.33 - lr: 0.000025 - momentum: 0.000000
2023-10-17 00:04:18,042 epoch 3 - iter 462/773 - loss 0.04971432 - time (sec): 27.23 - samples/sec: 2743.24 - lr: 0.000025 - momentum: 0.000000
2023-10-17 00:04:22,720 epoch 3 - iter 539/773 - loss 0.04992038 - time (sec): 31.90 - samples/sec: 2749.36 - lr: 0.000024 - momentum: 0.000000
2023-10-17 00:04:27,265 epoch 3 - iter 616/773 - loss 0.04910921 - time (sec): 36.45 - samples/sec: 2740.05 - lr: 0.000024 - momentum: 0.000000
2023-10-17 00:04:31,575 epoch 3 - iter 693/773 - loss 0.04872624 - time (sec): 40.76 - samples/sec: 2733.99 - lr: 0.000024 - momentum: 0.000000
2023-10-17 00:04:36,034 epoch 3 - iter 770/773 - loss 0.04859388 - time (sec): 45.22 - samples/sec: 2739.46 - lr: 0.000023 - momentum: 0.000000
2023-10-17 00:04:36,180 ----------------------------------------------------------------------------------------------------
2023-10-17 00:04:36,181 EPOCH 3 done: loss 0.0485 - lr: 0.000023
2023-10-17 00:04:38,322 DEV : loss 0.06617607176303864 - f1-score (micro avg) 0.7778
2023-10-17 00:04:38,335 saving best model
2023-10-17 00:04:38,793 ----------------------------------------------------------------------------------------------------
2023-10-17 00:04:43,088 epoch 4 - iter 77/773 - loss 0.03283229 - time (sec): 4.29 - samples/sec: 2716.07 - lr: 0.000023 - momentum: 0.000000
2023-10-17 00:04:47,633 epoch 4 - iter 154/773 - loss 0.02889967 - time (sec): 8.83 - samples/sec: 2678.88 - lr: 0.000023 - momentum: 0.000000
2023-10-17 00:04:52,242 epoch 4 - iter 231/773 - loss 0.03012282 - time (sec): 13.44 - samples/sec: 2702.53 - lr: 0.000022 - momentum: 0.000000
2023-10-17 00:04:56,714 epoch 4 - iter 308/773 - loss 0.02887586 - time (sec): 17.92 - samples/sec: 2711.46 - lr: 0.000022 - momentum: 0.000000
2023-10-17 00:05:01,312 epoch 4 - iter 385/773 - loss 0.03075624 - time (sec): 22.51 - samples/sec: 2701.08 - lr: 0.000022 - momentum: 0.000000
2023-10-17 00:05:05,714 epoch 4 - iter 462/773 - loss 0.03144270 - time (sec): 26.92 - samples/sec: 2704.90 - lr: 0.000021 - momentum: 0.000000
2023-10-17 00:05:10,256 epoch 4 - iter 539/773 - loss 0.03187058 - time (sec): 31.46 - samples/sec: 2716.26 - lr: 0.000021 - momentum: 0.000000
2023-10-17 00:05:14,768 epoch 4 - iter 616/773 - loss 0.03220682 - time (sec): 35.97 - samples/sec: 2715.62 - lr: 0.000021 - momentum: 0.000000
2023-10-17 00:05:19,176 epoch 4 - iter 693/773 - loss 0.03218910 - time (sec): 40.38 - samples/sec: 2735.10 - lr: 0.000020 - momentum: 0.000000
2023-10-17 00:05:23,884 epoch 4 - iter 770/773 - loss 0.03176330 - time (sec): 45.09 - samples/sec: 2743.13 - lr: 0.000020 - momentum: 0.000000
2023-10-17 00:05:24,069 ----------------------------------------------------------------------------------------------------
2023-10-17 00:05:24,069 EPOCH 4 done: loss 0.0318 - lr: 0.000020
2023-10-17 00:05:26,248 DEV : loss 0.08043687045574188 - f1-score (micro avg) 0.8008
2023-10-17 00:05:26,261 saving best model
2023-10-17 00:05:26,692 ----------------------------------------------------------------------------------------------------
2023-10-17 00:05:31,172 epoch 5 - iter 77/773 - loss 0.02544472 - time (sec): 4.48 - samples/sec: 2766.54 - lr: 0.000020 - momentum: 0.000000
2023-10-17 00:05:35,661 epoch 5 - iter 154/773 - loss 0.02257079 - time (sec): 8.97 - samples/sec: 2720.45 - lr: 0.000019 - momentum: 0.000000
2023-10-17 00:05:40,059 epoch 5 - iter 231/773 - loss 0.02161194 - time (sec): 13.36 - samples/sec: 2755.55 - lr: 0.000019 - momentum: 0.000000
2023-10-17 00:05:44,332 epoch 5 - iter 308/773 - loss 0.02181905 - time (sec): 17.64 - samples/sec: 2788.07 - lr: 0.000019 - momentum: 0.000000
2023-10-17 00:05:49,054 epoch 5 - iter 385/773 - loss 0.02217563 - time (sec): 22.36 - samples/sec: 2775.13 - lr: 0.000018 - momentum: 0.000000
2023-10-17 00:05:53,652 epoch 5 - iter 462/773 - loss 0.02354034 - time (sec): 26.96 - samples/sec: 2746.15 - lr: 0.000018 - momentum: 0.000000
2023-10-17 00:05:58,250 epoch 5 - iter 539/773 - loss 0.02344632 - time (sec): 31.55 - samples/sec: 2774.95 - lr: 0.000018 - momentum: 0.000000
2023-10-17 00:06:02,708 epoch 5 - iter 616/773 - loss 0.02222092 - time (sec): 36.01 - samples/sec: 2760.67 - lr: 0.000017 - momentum: 0.000000
2023-10-17 00:06:07,215 epoch 5 - iter 693/773 - loss 0.02187424 - time (sec): 40.52 - samples/sec: 2757.91 - lr: 0.000017 - momentum: 0.000000
2023-10-17 00:06:11,843 epoch 5 - iter 770/773 - loss 0.02196420 - time (sec): 45.15 - samples/sec: 2746.18 - lr: 0.000017 - momentum: 0.000000
2023-10-17 00:06:11,993 ----------------------------------------------------------------------------------------------------
2023-10-17 00:06:11,994 EPOCH 5 done: loss 0.0219 - lr: 0.000017
2023-10-17 00:06:14,045 DEV : loss 0.09281734377145767 - f1-score (micro avg) 0.7984
2023-10-17 00:06:14,058 ----------------------------------------------------------------------------------------------------
2023-10-17 00:06:18,476 epoch 6 - iter 77/773 - loss 0.01754628 - time (sec): 4.42 - samples/sec: 2851.07 - lr: 0.000016 - momentum: 0.000000
2023-10-17 00:06:23,103 epoch 6 - iter 154/773 - loss 0.01478913 - time (sec): 9.04 - samples/sec: 2805.88 - lr: 0.000016 - momentum: 0.000000
2023-10-17 00:06:27,628 epoch 6 - iter 231/773 - loss 0.01586836 - time (sec): 13.57 - samples/sec: 2737.88 - lr: 0.000016 - momentum: 0.000000
2023-10-17 00:06:32,019 epoch 6 - iter 308/773 - loss 0.01810026 - time (sec): 17.96 - samples/sec: 2764.63 - lr: 0.000015 - momentum: 0.000000
2023-10-17 00:06:36,634 epoch 6 - iter 385/773 - loss 0.01805116 - time (sec): 22.58 - samples/sec: 2764.74 - lr: 0.000015 - momentum: 0.000000
2023-10-17 00:06:41,164 epoch 6 - iter 462/773 - loss 0.01877742 - time (sec): 27.10 - samples/sec: 2739.87 - lr: 0.000015 - momentum: 0.000000
2023-10-17 00:06:45,844 epoch 6 - iter 539/773 - loss 0.01719607 - time (sec): 31.79 - samples/sec: 2734.19 - lr: 0.000014 - momentum: 0.000000
2023-10-17 00:06:50,351 epoch 6 - iter 616/773 - loss 0.01753376 - time (sec): 36.29 - samples/sec: 2692.58 - lr: 0.000014 - momentum: 0.000000
2023-10-17 00:06:55,337 epoch 6 - iter 693/773 - loss 0.01746324 - time (sec): 41.28 - samples/sec: 2680.67 - lr: 0.000014 - momentum: 0.000000
2023-10-17 00:06:59,848 epoch 6 - iter 770/773 - loss 0.01698989 - time (sec): 45.79 - samples/sec: 2707.29 - lr: 0.000013 - momentum: 0.000000
2023-10-17 00:07:00,006 ----------------------------------------------------------------------------------------------------
2023-10-17 00:07:00,006 EPOCH 6 done: loss 0.0171 - lr: 0.000013
2023-10-17 00:07:02,032 DEV : loss 0.0988919660449028 - f1-score (micro avg) 0.7862
2023-10-17 00:07:02,046 ----------------------------------------------------------------------------------------------------
2023-10-17 00:07:06,337 epoch 7 - iter 77/773 - loss 0.01237299 - time (sec): 4.29 - samples/sec: 2694.78 - lr: 0.000013 - momentum: 0.000000
2023-10-17 00:07:10,716 epoch 7 - iter 154/773 - loss 0.01304177 - time (sec): 8.67 - samples/sec: 2680.44 - lr: 0.000013 - momentum: 0.000000
2023-10-17 00:07:15,149 epoch 7 - iter 231/773 - loss 0.01266367 - time (sec): 13.10 - samples/sec: 2681.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 00:07:19,827 epoch 7 - iter 308/773 - loss 0.01099661 - time (sec): 17.78 - samples/sec: 2701.77 - lr: 0.000012 - momentum: 0.000000
2023-10-17 00:07:24,501 epoch 7 - iter 385/773 - loss 0.01065385 - time (sec): 22.45 - samples/sec: 2710.31 - lr: 0.000012 - momentum: 0.000000
2023-10-17 00:07:29,332 epoch 7 - iter 462/773 - loss 0.01100561 - time (sec): 27.29 - samples/sec: 2701.80 - lr: 0.000011 - momentum: 0.000000
2023-10-17 00:07:33,792 epoch 7 - iter 539/773 - loss 0.01140856 - time (sec): 31.74 - samples/sec: 2737.64 - lr: 0.000011 - momentum: 0.000000
2023-10-17 00:07:38,159 epoch 7 - iter 616/773 - loss 0.01149250 - time (sec): 36.11 - samples/sec: 2752.33 - lr: 0.000011 - momentum: 0.000000
2023-10-17 00:07:42,551 epoch 7 - iter 693/773 - loss 0.01180606 - time (sec): 40.50 - samples/sec: 2746.82 - lr: 0.000010 - momentum: 0.000000
2023-10-17 00:07:47,066 epoch 7 - iter 770/773 - loss 0.01156637 - time (sec): 45.02 - samples/sec: 2751.41 - lr: 0.000010 - momentum: 0.000000
2023-10-17 00:07:47,241 ----------------------------------------------------------------------------------------------------
2023-10-17 00:07:47,241 EPOCH 7 done: loss 0.0115 - lr: 0.000010
2023-10-17 00:07:49,330 DEV : loss 0.10134067386388779 - f1-score (micro avg) 0.8057
2023-10-17 00:07:49,343 saving best model
2023-10-17 00:07:49,797 ----------------------------------------------------------------------------------------------------
2023-10-17 00:07:54,435 epoch 8 - iter 77/773 - loss 0.01014825 - time (sec): 4.64 - samples/sec: 2668.24 - lr: 0.000010 - momentum: 0.000000
2023-10-17 00:07:59,271 epoch 8 - iter 154/773 - loss 0.00874093 - time (sec): 9.47 - samples/sec: 2726.47 - lr: 0.000009 - momentum: 0.000000
2023-10-17 00:08:03,768 epoch 8 - iter 231/773 - loss 0.00992027 - time (sec): 13.97 - samples/sec: 2711.53 - lr: 0.000009 - momentum: 0.000000
2023-10-17 00:08:08,281 epoch 8 - iter 308/773 - loss 0.00892334 - time (sec): 18.48 - samples/sec: 2743.99 - lr: 0.000009 - momentum: 0.000000
2023-10-17 00:08:12,957 epoch 8 - iter 385/773 - loss 0.00881416 - time (sec): 23.16 - samples/sec: 2741.93 - lr: 0.000008 - momentum: 0.000000
2023-10-17 00:08:17,660 epoch 8 - iter 462/773 - loss 0.00832352 - time (sec): 27.86 - samples/sec: 2744.83 - lr: 0.000008 - momentum: 0.000000
2023-10-17 00:08:22,079 epoch 8 - iter 539/773 - loss 0.00829162 - time (sec): 32.28 - samples/sec: 2727.50 - lr: 0.000008 - momentum: 0.000000
2023-10-17 00:08:26,396 epoch 8 - iter 616/773 - loss 0.00850288 - time (sec): 36.60 - samples/sec: 2717.51 - lr: 0.000007 - momentum: 0.000000
2023-10-17 00:08:30,865 epoch 8 - iter 693/773 - loss 0.00806884 - time (sec): 41.07 - samples/sec: 2731.36 - lr: 0.000007 - momentum: 0.000000
2023-10-17 00:08:35,232 epoch 8 - iter 770/773 - loss 0.00809371 - time (sec): 45.43 - samples/sec: 2726.00 - lr: 0.000007 - momentum: 0.000000
2023-10-17 00:08:35,396 ----------------------------------------------------------------------------------------------------
2023-10-17 00:08:35,397 EPOCH 8 done: loss 0.0081 - lr: 0.000007
2023-10-17 00:08:37,446 DEV : loss 0.10502836853265762 - f1-score (micro avg) 0.8122
2023-10-17 00:08:37,459 saving best model
2023-10-17 00:08:37,910 ----------------------------------------------------------------------------------------------------
2023-10-17 00:08:42,371 epoch 9 - iter 77/773 - loss 0.01014486 - time (sec): 4.46 - samples/sec: 2743.83 - lr: 0.000006 - momentum: 0.000000
2023-10-17 00:08:46,899 epoch 9 - iter 154/773 - loss 0.00742422 - time (sec): 8.99 - samples/sec: 2812.20 - lr: 0.000006 - momentum: 0.000000
2023-10-17 00:08:51,665 epoch 9 - iter 231/773 - loss 0.00589126 - time (sec): 13.75 - samples/sec: 2791.71 - lr: 0.000006 - momentum: 0.000000
2023-10-17 00:08:56,063 epoch 9 - iter 308/773 - loss 0.00726357 - time (sec): 18.15 - samples/sec: 2778.16 - lr: 0.000005 - momentum: 0.000000
2023-10-17 00:09:00,644 epoch 9 - iter 385/773 - loss 0.00604410 - time (sec): 22.73 - samples/sec: 2758.51 - lr: 0.000005 - momentum: 0.000000
2023-10-17 00:09:05,214 epoch 9 - iter 462/773 - loss 0.00579631 - time (sec): 27.30 - samples/sec: 2744.61 - lr: 0.000005 - momentum: 0.000000
2023-10-17 00:09:09,519 epoch 9 - iter 539/773 - loss 0.00544690 - time (sec): 31.61 - samples/sec: 2733.80 - lr: 0.000004 - momentum: 0.000000
2023-10-17 00:09:13,933 epoch 9 - iter 616/773 - loss 0.00539373 - time (sec): 36.02 - samples/sec: 2753.50 - lr: 0.000004 - momentum: 0.000000
2023-10-17 00:09:18,546 epoch 9 - iter 693/773 - loss 0.00528883 - time (sec): 40.63 - samples/sec: 2748.24 - lr: 0.000004 - momentum: 0.000000
2023-10-17 00:09:22,935 epoch 9 - iter 770/773 - loss 0.00566848 - time (sec): 45.02 - samples/sec: 2748.09 - lr: 0.000003 - momentum: 0.000000
2023-10-17 00:09:23,109 ----------------------------------------------------------------------------------------------------
2023-10-17 00:09:23,109 EPOCH 9 done: loss 0.0056 - lr: 0.000003
2023-10-17 00:09:25,144 DEV : loss 0.11343234777450562 - f1-score (micro avg) 0.7934
2023-10-17 00:09:25,157 ----------------------------------------------------------------------------------------------------
2023-10-17 00:09:29,896 epoch 10 - iter 77/773 - loss 0.00231576 - time (sec): 4.74 - samples/sec: 2654.62 - lr: 0.000003 - momentum: 0.000000
2023-10-17 00:09:34,465 epoch 10 - iter 154/773 - loss 0.00249641 - time (sec): 9.31 - samples/sec: 2684.47 - lr: 0.000003 - momentum: 0.000000
2023-10-17 00:09:38,825 epoch 10 - iter 231/773 - loss 0.00374059 - time (sec): 13.67 - samples/sec: 2660.85 - lr: 0.000002 - momentum: 0.000000
2023-10-17 00:09:43,507 epoch 10 - iter 308/773 - loss 0.00401961 - time (sec): 18.35 - samples/sec: 2682.09 - lr: 0.000002 - momentum: 0.000000
2023-10-17 00:09:48,107 epoch 10 - iter 385/773 - loss 0.00388607 - time (sec): 22.95 - samples/sec: 2728.17 - lr: 0.000002 - momentum: 0.000000
2023-10-17 00:09:52,698 epoch 10 - iter 462/773 - loss 0.00364499 - time (sec): 27.54 - samples/sec: 2749.47 - lr: 0.000001 - momentum: 0.000000
2023-10-17 00:09:57,479 epoch 10 - iter 539/773 - loss 0.00343129 - time (sec): 32.32 - samples/sec: 2723.02 - lr: 0.000001 - momentum: 0.000000
2023-10-17 00:10:01,847 epoch 10 - iter 616/773 - loss 0.00340294 - time (sec): 36.69 - samples/sec: 2714.54 - lr: 0.000001 - momentum: 0.000000
2023-10-17 00:10:06,269 epoch 10 - iter 693/773 - loss 0.00353649 - time (sec): 41.11 - samples/sec: 2721.73 - lr: 0.000000 - momentum: 0.000000
2023-10-17 00:10:10,747 epoch 10 - iter 770/773 - loss 0.00363040 - time (sec): 45.59 - samples/sec: 2716.64 - lr: 0.000000 - momentum: 0.000000
2023-10-17 00:10:10,902 ----------------------------------------------------------------------------------------------------
2023-10-17 00:10:10,903 EPOCH 10 done: loss 0.0036 - lr: 0.000000
2023-10-17 00:10:12,932 DEV : loss 0.1106431633234024 - f1-score (micro avg) 0.7926
2023-10-17 00:10:13,311 ----------------------------------------------------------------------------------------------------
2023-10-17 00:10:13,312 Loading model from best epoch ...
2023-10-17 00:10:14,890 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 00:10:21,007
Results:
- F-score (micro) 0.7928
- F-score (macro) 0.6941
- Accuracy 0.6803
By class:
              precision    recall  f1-score   support

         LOC     0.8269    0.8584    0.8423       946
    BUILDING     0.5848    0.5405    0.5618       185
      STREET     0.6610    0.6964    0.6783        56

   micro avg     0.7847    0.8012    0.7928      1187
   macro avg     0.6909    0.6984    0.6941      1187
weighted avg     0.7813    0.8012    0.7909      1187
2023-10-17 00:10:21,007 ----------------------------------------------------------------------------------------------------
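
The macro and weighted averages in the final report can be re-derived from the three per-class rows. The sketch below is not part of the training run. It assumes the usual definitions: macro is the unweighted mean over classes, and weighted is the support-weighted mean. The names `per_class`, `macro_average` and `weighted_average` are illustrative. Because the inputs are the four-decimal values printed in the log, recomputed figures may differ from the logged ones in the last digit.

```python
# Per-class rows copied from the "By class" table above:
# (precision, recall, f1-score, support)
per_class = {
    "LOC": (0.8269, 0.8584, 0.8423, 946),
    "BUILDING": (0.5848, 0.5405, 0.5618, 185),
    "STREET": (0.6610, 0.6964, 0.6783, 56),
}

PRECISION, RECALL, F1, SUPPORT = range(4)


def macro_average(rows, column):
    """Unweighted mean of one metric column over all classes."""
    return sum(row[column] for row in rows) / len(rows)


def weighted_average(rows, column):
    """Mean of one metric column, weighted by each class's support."""
    total_support = sum(row[SUPPORT] for row in rows)
    return sum(row[column] * row[SUPPORT] for row in rows) / total_support


rows = list(per_class.values())
total_support = sum(row[SUPPORT] for row in rows)

macro_f1 = macro_average(rows, F1)
weighted_f1 = weighted_average(rows, F1)

print(f"support total : {total_support}")
print(f"macro F1      : {macro_f1:.4f}")
print(f"weighted F1   : {weighted_f1:.4f}")
```

The macro F1 is well below the micro F1 because BUILDING and STREET, the two weaker classes, count as much as LOC in the unweighted mean even though LOC accounts for 946 of the 1187 test entities.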