2023-11-16 00:45:31,082 ----------------------------------------------------------------------------------------------------
2023-11-16 00:45:31,084 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(250003, 1024)
        (position_embeddings): Embedding(514, 1024, padding_idx=1)
        (token_type_embeddings): Embedding(1, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-23): 24 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-11-16 00:45:31,084 ----------------------------------------------------------------------------------------------------
2023-11-16 00:45:31,084 MultiCorpus: 30000 train + 10000 dev + 10000 test sentences
- ColumnCorpus Corpus: 20000 train + 0 dev + 0 test sentences - /root/.flair/datasets/ner_multi_xtreme/en
- ColumnCorpus Corpus: 10000 train + 10000 dev + 10000 test sentences - /root/.flair/datasets/ner_multi_xtreme/ka
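[Editor's note: the on-disk path ner_multi_xtreme/{en,ka} suggests Flair's NER_MULTI_XTREME loader (WikiANN splits from the XTREME benchmark). A minimal sketch of assembling such a MultiCorpus; the log does not show how the English dev/test splits were emptied (0 dev + 0 test above), so dropping them via the private split attributes is an assumption.]

```python
# Minimal sketch, not the author's script: build the 30k/10k/10k MultiCorpus
# from the XTREME (WikiANN) NER splits that the logged dataset path implies.
from flair.data import MultiCorpus
from flair.datasets import NER_MULTI_XTREME

english = NER_MULTI_XTREME(languages="en")   # 20k train split
georgian = NER_MULTI_XTREME(languages="ka")  # 10k train / 10k dev / 10k test

# Assumption: the en dev/test splits were dropped so that only Georgian data
# drives model selection and final evaluation (private attributes; the exact
# mechanism used for this run is not shown in the log).
english._dev = None
english._test = None

corpus = MultiCorpus([english, georgian])
```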
2023-11-16 00:45:31,084 ----------------------------------------------------------------------------------------------------
2023-11-16 00:45:31,084 Train: 30000 sentences
2023-11-16 00:45:31,084 (train_with_dev=False, train_with_test=False)
2023-11-16 00:45:31,084 ----------------------------------------------------------------------------------------------------
2023-11-16 00:45:31,084 Training Params:
2023-11-16 00:45:31,084 - learning_rate: "5e-06"
2023-11-16 00:45:31,084 - mini_batch_size: "4"
2023-11-16 00:45:31,084 - max_epochs: "10"
2023-11-16 00:45:31,084 - shuffle: "True"
2023-11-16 00:45:31,084 ----------------------------------------------------------------------------------------------------
2023-11-16 00:45:31,084 Plugins:
2023-11-16 00:45:31,084 - TensorboardLogger
2023-11-16 00:45:31,084 - LinearScheduler | warmup_fraction: '0.1'
2023-11-16 00:45:31,084 ----------------------------------------------------------------------------------------------------
2023-11-16 00:45:31,084 Final evaluation on model from best epoch (best-model.pt)
2023-11-16 00:45:31,085 - metric: "('micro avg', 'f1-score')"
2023-11-16 00:45:31,085 ----------------------------------------------------------------------------------------------------
2023-11-16 00:45:31,085 Computation:
2023-11-16 00:45:31,085 - compute on device: cuda:0
2023-11-16 00:45:31,085 - embedding storage: none
2023-11-16 00:45:31,085 ----------------------------------------------------------------------------------------------------
2023-11-16 00:45:31,085 Model training base path: "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-2"
2023-11-16 00:45:31,085 ----------------------------------------------------------------------------------------------------
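[Editor's note: the logged parameters correspond to a Flair fine-tuning run. A minimal sketch, assuming the `corpus` and `tagger` objects from the sketches above. fine_tune() installs a linear warmup/decay scheduler by default, and warmup_fraction 0.1 plus the ("micro avg", "f1-score") selection metric shown above are the Flair defaults; how the TensorboardLogger plugin was attached is not shown in the log and varies across Flair versions, so it is omitted here.]

```python
# Minimal sketch of the fine-tuning call matching the logged parameters.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-2",
    learning_rate=5e-06,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
)
```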
2023-11-16 00:45:31,085 ----------------------------------------------------------------------------------------------------
2023-11-16 00:45:31,085 Logging anything other than scalars to TensorBoard is currently not supported.
2023-11-16 00:47:04,086 epoch 1 - iter 750/7500 - loss 3.20421711 - time (sec): 93.00 - samples/sec: 254.76 - lr: 0.000000 - momentum: 0.000000
2023-11-16 00:48:36,937 epoch 1 - iter 1500/7500 - loss 2.53046773 - time (sec): 185.85 - samples/sec: 256.83 - lr: 0.000001 - momentum: 0.000000
2023-11-16 00:50:07,872 epoch 1 - iter 2250/7500 - loss 2.16462133 - time (sec): 276.79 - samples/sec: 258.95 - lr: 0.000001 - momentum: 0.000000
2023-11-16 00:51:39,794 epoch 1 - iter 3000/7500 - loss 1.89258823 - time (sec): 368.71 - samples/sec: 258.86 - lr: 0.000002 - momentum: 0.000000
2023-11-16 00:53:11,581 epoch 1 - iter 3750/7500 - loss 1.66150429 - time (sec): 460.49 - samples/sec: 259.63 - lr: 0.000002 - momentum: 0.000000
2023-11-16 00:54:42,341 epoch 1 - iter 4500/7500 - loss 1.47974045 - time (sec): 551.25 - samples/sec: 261.00 - lr: 0.000003 - momentum: 0.000000
2023-11-16 00:56:14,348 epoch 1 - iter 5250/7500 - loss 1.33779802 - time (sec): 643.26 - samples/sec: 261.74 - lr: 0.000003 - momentum: 0.000000
2023-11-16 00:57:44,891 epoch 1 - iter 6000/7500 - loss 1.23224942 - time (sec): 733.80 - samples/sec: 262.47 - lr: 0.000004 - momentum: 0.000000
2023-11-16 00:59:17,561 epoch 1 - iter 6750/7500 - loss 1.14528322 - time (sec): 826.47 - samples/sec: 262.22 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:00:50,271 epoch 1 - iter 7500/7500 - loss 1.07465936 - time (sec): 919.18 - samples/sec: 261.97 - lr: 0.000005 - momentum: 0.000000
2023-11-16 01:00:50,273 ----------------------------------------------------------------------------------------------------
2023-11-16 01:00:50,273 EPOCH 1 done: loss 1.0747 - lr: 0.000005
2023-11-16 01:01:17,256 DEV : loss 0.2856157124042511 - f1-score (micro avg) 0.8045
2023-11-16 01:01:18,998 saving best model
2023-11-16 01:01:20,796 ----------------------------------------------------------------------------------------------------
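[Editor's note: the lr column above ramps from 0 to the peak 5e-06 over exactly epoch 1 and then decays toward 0 by epoch 10. That is consistent with the LinearScheduler settings: 10 epochs x 7,500 iterations = 75,000 total steps, and warmup_fraction 0.1 gives 7,500 warmup steps, i.e. one full epoch. A sketch of the implied schedule, not Flair's actual implementation:]

```python
# Linear warmup-then-decay schedule implied by the lr column in this log.
def linear_lr(step: int, peak: float = 5e-06, total: int = 75_000,
              warmup_fraction: float = 0.1) -> float:
    warmup = int(total * warmup_fraction)  # 7,500 steps = epoch 1
    if step < warmup:
        return peak * step / warmup        # ramp up during epoch 1
    return peak * (total - step) / (total - warmup)  # decay to 0 by epoch 10

# e.g. linear_lr(15_000) -> ~4.4e-06, matching the 0.000004 logged at the
# end of epoch 2.
```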
2023-11-16 01:02:54,269 epoch 2 - iter 750/7500 - loss 0.40158514 - time (sec): 93.47 - samples/sec: 252.67 - lr: 0.000005 - momentum: 0.000000
2023-11-16 01:04:26,600 epoch 2 - iter 1500/7500 - loss 0.40578126 - time (sec): 185.80 - samples/sec: 257.91 - lr: 0.000005 - momentum: 0.000000
2023-11-16 01:06:01,665 epoch 2 - iter 2250/7500 - loss 0.40182467 - time (sec): 280.87 - samples/sec: 256.06 - lr: 0.000005 - momentum: 0.000000
2023-11-16 01:07:35,409 epoch 2 - iter 3000/7500 - loss 0.40425251 - time (sec): 374.61 - samples/sec: 255.46 - lr: 0.000005 - momentum: 0.000000
2023-11-16 01:09:06,889 epoch 2 - iter 3750/7500 - loss 0.40579040 - time (sec): 466.09 - samples/sec: 256.59 - lr: 0.000005 - momentum: 0.000000
2023-11-16 01:10:37,461 epoch 2 - iter 4500/7500 - loss 0.40296543 - time (sec): 556.66 - samples/sec: 257.85 - lr: 0.000005 - momentum: 0.000000
2023-11-16 01:12:10,834 epoch 2 - iter 5250/7500 - loss 0.39886416 - time (sec): 650.03 - samples/sec: 258.59 - lr: 0.000005 - momentum: 0.000000
2023-11-16 01:13:43,058 epoch 2 - iter 6000/7500 - loss 0.40163262 - time (sec): 742.26 - samples/sec: 258.99 - lr: 0.000005 - momentum: 0.000000
2023-11-16 01:15:15,768 epoch 2 - iter 6750/7500 - loss 0.39984468 - time (sec): 834.97 - samples/sec: 259.04 - lr: 0.000005 - momentum: 0.000000
2023-11-16 01:16:48,990 epoch 2 - iter 7500/7500 - loss 0.39673334 - time (sec): 928.19 - samples/sec: 259.43 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:16:48,992 ----------------------------------------------------------------------------------------------------
2023-11-16 01:16:48,992 EPOCH 2 done: loss 0.3967 - lr: 0.000004
2023-11-16 01:17:16,238 DEV : loss 0.27984410524368286 - f1-score (micro avg) 0.8635
2023-11-16 01:17:18,468 saving best model
2023-11-16 01:17:21,478 ----------------------------------------------------------------------------------------------------
2023-11-16 01:18:56,267 epoch 3 - iter 750/7500 - loss 0.34788380 - time (sec): 94.78 - samples/sec: 252.61 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:20:29,227 epoch 3 - iter 1500/7500 - loss 0.35790619 - time (sec): 187.74 - samples/sec: 257.86 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:22:00,989 epoch 3 - iter 2250/7500 - loss 0.36275974 - time (sec): 279.51 - samples/sec: 256.05 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:23:32,112 epoch 3 - iter 3000/7500 - loss 0.35448466 - time (sec): 370.63 - samples/sec: 257.61 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:25:04,514 epoch 3 - iter 3750/7500 - loss 0.35784162 - time (sec): 463.03 - samples/sec: 258.62 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:26:37,957 epoch 3 - iter 4500/7500 - loss 0.35491385 - time (sec): 556.48 - samples/sec: 259.13 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:28:09,894 epoch 3 - iter 5250/7500 - loss 0.35268653 - time (sec): 648.41 - samples/sec: 259.90 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:29:43,647 epoch 3 - iter 6000/7500 - loss 0.35432380 - time (sec): 742.17 - samples/sec: 259.26 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:31:15,583 epoch 3 - iter 6750/7500 - loss 0.35073442 - time (sec): 834.10 - samples/sec: 259.69 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:32:47,991 epoch 3 - iter 7500/7500 - loss 0.34845272 - time (sec): 926.51 - samples/sec: 259.90 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:32:47,993 ----------------------------------------------------------------------------------------------------
2023-11-16 01:32:47,993 EPOCH 3 done: loss 0.3485 - lr: 0.000004
2023-11-16 01:33:14,670 DEV : loss 0.2744104862213135 - f1-score (micro avg) 0.8834
2023-11-16 01:33:16,359 saving best model
2023-11-16 01:33:18,657 ----------------------------------------------------------------------------------------------------
2023-11-16 01:34:49,914 epoch 4 - iter 750/7500 - loss 0.29966012 - time (sec): 91.25 - samples/sec: 265.65 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:36:21,479 epoch 4 - iter 1500/7500 - loss 0.29512059 - time (sec): 182.82 - samples/sec: 262.70 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:37:53,047 epoch 4 - iter 2250/7500 - loss 0.29934339 - time (sec): 274.38 - samples/sec: 262.65 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:39:25,499 epoch 4 - iter 3000/7500 - loss 0.29881097 - time (sec): 366.84 - samples/sec: 263.27 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:40:56,808 epoch 4 - iter 3750/7500 - loss 0.29543460 - time (sec): 458.15 - samples/sec: 264.24 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:42:30,166 epoch 4 - iter 4500/7500 - loss 0.29266567 - time (sec): 551.50 - samples/sec: 262.68 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:44:03,529 epoch 4 - iter 5250/7500 - loss 0.29429266 - time (sec): 644.87 - samples/sec: 262.58 - lr: 0.000004 - momentum: 0.000000
2023-11-16 01:45:36,284 epoch 4 - iter 6000/7500 - loss 0.29201070 - time (sec): 737.62 - samples/sec: 261.99 - lr: 0.000003 - momentum: 0.000000
2023-11-16 01:47:11,437 epoch 4 - iter 6750/7500 - loss 0.29527772 - time (sec): 832.77 - samples/sec: 260.57 - lr: 0.000003 - momentum: 0.000000
2023-11-16 01:48:47,245 epoch 4 - iter 7500/7500 - loss 0.29729031 - time (sec): 928.58 - samples/sec: 259.32 - lr: 0.000003 - momentum: 0.000000
2023-11-16 01:48:47,248 ----------------------------------------------------------------------------------------------------
2023-11-16 01:48:47,248 EPOCH 4 done: loss 0.2973 - lr: 0.000003
2023-11-16 01:49:14,914 DEV : loss 0.3016064167022705 - f1-score (micro avg) 0.88
2023-11-16 01:49:16,910 ----------------------------------------------------------------------------------------------------
2023-11-16 01:50:49,496 epoch 5 - iter 750/7500 - loss 0.25377931 - time (sec): 92.58 - samples/sec: 262.37 - lr: 0.000003 - momentum: 0.000000
2023-11-16 01:52:20,605 epoch 5 - iter 1500/7500 - loss 0.25135538 - time (sec): 183.69 - samples/sec: 265.72 - lr: 0.000003 - momentum: 0.000000
2023-11-16 01:53:53,577 epoch 5 - iter 2250/7500 - loss 0.25580791 - time (sec): 276.66 - samples/sec: 262.18 - lr: 0.000003 - momentum: 0.000000
2023-11-16 01:55:26,799 epoch 5 - iter 3000/7500 - loss 0.25964203 - time (sec): 369.89 - samples/sec: 260.86 - lr: 0.000003 - momentum: 0.000000
2023-11-16 01:57:00,788 epoch 5 - iter 3750/7500 - loss 0.26132921 - time (sec): 463.87 - samples/sec: 259.14 - lr: 0.000003 - momentum: 0.000000
2023-11-16 01:58:33,972 epoch 5 - iter 4500/7500 - loss 0.26036055 - time (sec): 557.06 - samples/sec: 258.72 - lr: 0.000003 - momentum: 0.000000
2023-11-16 02:00:09,582 epoch 5 - iter 5250/7500 - loss 0.25878497 - time (sec): 652.67 - samples/sec: 257.81 - lr: 0.000003 - momentum: 0.000000
2023-11-16 02:01:48,359 epoch 5 - iter 6000/7500 - loss 0.25604978 - time (sec): 751.45 - samples/sec: 256.54 - lr: 0.000003 - momentum: 0.000000
2023-11-16 02:03:25,431 epoch 5 - iter 6750/7500 - loss 0.25651438 - time (sec): 848.52 - samples/sec: 255.47 - lr: 0.000003 - momentum: 0.000000
2023-11-16 02:04:59,400 epoch 5 - iter 7500/7500 - loss 0.25488422 - time (sec): 942.49 - samples/sec: 255.49 - lr: 0.000003 - momentum: 0.000000
2023-11-16 02:04:59,403 ----------------------------------------------------------------------------------------------------
2023-11-16 02:04:59,403 EPOCH 5 done: loss 0.2549 - lr: 0.000003
2023-11-16 02:05:26,452 DEV : loss 0.3108203411102295 - f1-score (micro avg) 0.8923
2023-11-16 02:05:28,547 saving best model
2023-11-16 02:05:31,087 ----------------------------------------------------------------------------------------------------
2023-11-16 02:07:03,915 epoch 6 - iter 750/7500 - loss 0.20941918 - time (sec): 92.82 - samples/sec: 258.36 - lr: 0.000003 - momentum: 0.000000
2023-11-16 02:08:34,259 epoch 6 - iter 1500/7500 - loss 0.20871668 - time (sec): 183.17 - samples/sec: 261.51 - lr: 0.000003 - momentum: 0.000000
2023-11-16 02:10:06,286 epoch 6 - iter 2250/7500 - loss 0.21719166 - time (sec): 275.20 - samples/sec: 261.07 - lr: 0.000003 - momentum: 0.000000
2023-11-16 02:11:38,611 epoch 6 - iter 3000/7500 - loss 0.22345226 - time (sec): 367.52 - samples/sec: 260.76 - lr: 0.000003 - momentum: 0.000000
2023-11-16 02:13:13,015 epoch 6 - iter 3750/7500 - loss 0.21948790 - time (sec): 461.92 - samples/sec: 260.22 - lr: 0.000003 - momentum: 0.000000
2023-11-16 02:14:48,651 epoch 6 - iter 4500/7500 - loss 0.22195698 - time (sec): 557.56 - samples/sec: 257.83 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:16:23,096 epoch 6 - iter 5250/7500 - loss 0.22241666 - time (sec): 652.01 - samples/sec: 257.73 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:17:57,154 epoch 6 - iter 6000/7500 - loss 0.22130923 - time (sec): 746.06 - samples/sec: 258.53 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:19:29,542 epoch 6 - iter 6750/7500 - loss 0.21994780 - time (sec): 838.45 - samples/sec: 258.60 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:21:01,165 epoch 6 - iter 7500/7500 - loss 0.21770578 - time (sec): 930.07 - samples/sec: 258.90 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:21:01,168 ----------------------------------------------------------------------------------------------------
2023-11-16 02:21:01,168 EPOCH 6 done: loss 0.2177 - lr: 0.000002
2023-11-16 02:21:28,850 DEV : loss 0.31180956959724426 - f1-score (micro avg) 0.8955
2023-11-16 02:21:31,381 saving best model
2023-11-16 02:21:34,381 ----------------------------------------------------------------------------------------------------
2023-11-16 02:23:08,670 epoch 7 - iter 750/7500 - loss 0.17025570 - time (sec): 94.28 - samples/sec: 253.94 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:24:44,294 epoch 7 - iter 1500/7500 - loss 0.18032455 - time (sec): 189.91 - samples/sec: 253.38 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:26:19,207 epoch 7 - iter 2250/7500 - loss 0.18368583 - time (sec): 284.82 - samples/sec: 253.98 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:27:52,445 epoch 7 - iter 3000/7500 - loss 0.18638293 - time (sec): 378.06 - samples/sec: 254.49 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:29:25,782 epoch 7 - iter 3750/7500 - loss 0.18144838 - time (sec): 471.40 - samples/sec: 255.30 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:30:59,415 epoch 7 - iter 4500/7500 - loss 0.18697815 - time (sec): 565.03 - samples/sec: 255.93 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:32:31,970 epoch 7 - iter 5250/7500 - loss 0.18690520 - time (sec): 657.58 - samples/sec: 256.63 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:34:05,459 epoch 7 - iter 6000/7500 - loss 0.18389577 - time (sec): 751.07 - samples/sec: 256.38 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:35:40,741 epoch 7 - iter 6750/7500 - loss 0.18345948 - time (sec): 846.36 - samples/sec: 255.76 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:37:17,970 epoch 7 - iter 7500/7500 - loss 0.18350743 - time (sec): 943.58 - samples/sec: 255.19 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:37:17,972 ----------------------------------------------------------------------------------------------------
2023-11-16 02:37:17,972 EPOCH 7 done: loss 0.1835 - lr: 0.000002
2023-11-16 02:37:45,548 DEV : loss 0.31052064895629883 - f1-score (micro avg) 0.901
2023-11-16 02:37:47,535 saving best model
2023-11-16 02:37:49,958 ----------------------------------------------------------------------------------------------------
2023-11-16 02:39:25,797 epoch 8 - iter 750/7500 - loss 0.14917987 - time (sec): 95.84 - samples/sec: 255.82 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:40:58,825 epoch 8 - iter 1500/7500 - loss 0.16554104 - time (sec): 188.86 - samples/sec: 254.74 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:42:33,026 epoch 8 - iter 2250/7500 - loss 0.16246413 - time (sec): 283.06 - samples/sec: 254.01 - lr: 0.000002 - momentum: 0.000000
2023-11-16 02:44:05,013 epoch 8 - iter 3000/7500 - loss 0.15793136 - time (sec): 375.05 - samples/sec: 254.92 - lr: 0.000001 - momentum: 0.000000
2023-11-16 02:45:37,970 epoch 8 - iter 3750/7500 - loss 0.15705842 - time (sec): 468.01 - samples/sec: 255.77 - lr: 0.000001 - momentum: 0.000000
2023-11-16 02:47:09,872 epoch 8 - iter 4500/7500 - loss 0.15757577 - time (sec): 559.91 - samples/sec: 256.37 - lr: 0.000001 - momentum: 0.000000
2023-11-16 02:48:43,951 epoch 8 - iter 5250/7500 - loss 0.15530409 - time (sec): 653.99 - samples/sec: 256.53 - lr: 0.000001 - momentum: 0.000000
2023-11-16 02:50:15,610 epoch 8 - iter 6000/7500 - loss 0.15633332 - time (sec): 745.65 - samples/sec: 258.01 - lr: 0.000001 - momentum: 0.000000
2023-11-16 02:51:49,707 epoch 8 - iter 6750/7500 - loss 0.15781340 - time (sec): 839.75 - samples/sec: 258.51 - lr: 0.000001 - momentum: 0.000000
2023-11-16 02:53:22,758 epoch 8 - iter 7500/7500 - loss 0.15738201 - time (sec): 932.80 - samples/sec: 258.14 - lr: 0.000001 - momentum: 0.000000
2023-11-16 02:53:22,761 ----------------------------------------------------------------------------------------------------
2023-11-16 02:53:22,761 EPOCH 8 done: loss 0.1574 - lr: 0.000001
2023-11-16 02:53:49,642 DEV : loss 0.31349387764930725 - f1-score (micro avg) 0.9012
2023-11-16 02:53:51,537 saving best model
2023-11-16 02:53:53,925 ----------------------------------------------------------------------------------------------------
2023-11-16 02:55:30,620 epoch 9 - iter 750/7500 - loss 0.12454614 - time (sec): 96.69 - samples/sec: 248.06 - lr: 0.000001 - momentum: 0.000000
2023-11-16 02:57:02,449 epoch 9 - iter 1500/7500 - loss 0.12492215 - time (sec): 188.52 - samples/sec: 254.80 - lr: 0.000001 - momentum: 0.000000
2023-11-16 02:58:34,812 epoch 9 - iter 2250/7500 - loss 0.13002767 - time (sec): 280.88 - samples/sec: 257.65 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:00:06,592 epoch 9 - iter 3000/7500 - loss 0.12936109 - time (sec): 372.66 - samples/sec: 259.06 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:01:36,302 epoch 9 - iter 3750/7500 - loss 0.12949282 - time (sec): 462.37 - samples/sec: 260.59 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:03:09,938 epoch 9 - iter 4500/7500 - loss 0.13100535 - time (sec): 556.01 - samples/sec: 259.28 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:04:42,142 epoch 9 - iter 5250/7500 - loss 0.13241958 - time (sec): 648.21 - samples/sec: 259.22 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:06:15,744 epoch 9 - iter 6000/7500 - loss 0.13197722 - time (sec): 741.81 - samples/sec: 260.12 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:07:52,365 epoch 9 - iter 6750/7500 - loss 0.13101789 - time (sec): 838.44 - samples/sec: 258.40 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:09:30,882 epoch 9 - iter 7500/7500 - loss 0.13192127 - time (sec): 936.95 - samples/sec: 257.00 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:09:30,885 ----------------------------------------------------------------------------------------------------
2023-11-16 03:09:30,886 EPOCH 9 done: loss 0.1319 - lr: 0.000001
2023-11-16 03:09:58,661 DEV : loss 0.3276961147785187 - f1-score (micro avg) 0.9002
2023-11-16 03:10:01,082 ----------------------------------------------------------------------------------------------------
2023-11-16 03:11:36,851 epoch 10 - iter 750/7500 - loss 0.10811317 - time (sec): 95.77 - samples/sec: 255.65 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:13:11,188 epoch 10 - iter 1500/7500 - loss 0.10879497 - time (sec): 190.10 - samples/sec: 251.97 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:14:44,600 epoch 10 - iter 2250/7500 - loss 0.11179486 - time (sec): 283.52 - samples/sec: 253.42 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:16:16,891 epoch 10 - iter 3000/7500 - loss 0.11110255 - time (sec): 375.81 - samples/sec: 256.07 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:17:51,280 epoch 10 - iter 3750/7500 - loss 0.11617599 - time (sec): 470.20 - samples/sec: 254.90 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:19:22,531 epoch 10 - iter 4500/7500 - loss 0.11661813 - time (sec): 561.45 - samples/sec: 256.54 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:20:53,037 epoch 10 - iter 5250/7500 - loss 0.11803804 - time (sec): 651.95 - samples/sec: 257.89 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:22:24,015 epoch 10 - iter 6000/7500 - loss 0.11722958 - time (sec): 742.93 - samples/sec: 258.60 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:23:55,236 epoch 10 - iter 6750/7500 - loss 0.11710786 - time (sec): 834.15 - samples/sec: 260.07 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:25:24,728 epoch 10 - iter 7500/7500 - loss 0.11716487 - time (sec): 923.64 - samples/sec: 260.70 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:25:24,731 ----------------------------------------------------------------------------------------------------
2023-11-16 03:25:24,731 EPOCH 10 done: loss 0.1172 - lr: 0.000000
2023-11-16 03:25:51,758 DEV : loss 0.32983964681625366 - f1-score (micro avg) 0.9006
2023-11-16 03:25:55,590 ----------------------------------------------------------------------------------------------------
2023-11-16 03:25:55,592 Loading model from best epoch ...
2023-11-16 03:26:03,736 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER
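[Editor's note: the 13 tags follow the BIOES scheme (Begin/Inside/End/Single plus O) over three entity types, and the span-level scores below are computed over 5288 + 3962 + 3807 = 13,057 gold spans. An illustrative decoder from BIOES tag sequences to spans; Flair performs this internally.]

```python
# Minimal BIOES span decoder (illustrative only).
def bioes_to_spans(tags):
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-")
        if prefix == "S":                      # single-token entity
            spans.append((i, i, label))
            start = None
        elif prefix == "B":                    # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((start, i, label))
            start = None
        # "I" continues an open span; nothing to record yet
    return spans

# e.g. bioes_to_spans(["B-PER", "E-PER", "O", "S-LOC"])
# -> [(0, 1, "PER"), (3, 3, "LOC")]
```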
2023-11-16 03:26:31,850
Results:
- F-score (micro) 0.9027
- F-score (macro) 0.9014
- Accuracy 0.8521
By class:
              precision    recall  f1-score   support

         LOC     0.9036    0.9141    0.9088      5288
         PER     0.9238    0.9427    0.9332      3962
         ORG     0.8593    0.8650    0.8622      3807

   micro avg     0.8969    0.9085    0.9027     13057
   macro avg     0.8956    0.9073    0.9014     13057
weighted avg     0.8968    0.9085    0.9026     13057
2023-11-16 03:26:31,850 ----------------------------------------------------------------------------------------------------
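[Editor's note: the best checkpoint is saved as best-model.pt under the base path logged above. A minimal inference sketch; the Georgian sentence is a hypothetical example ("Giorgi Margvelashvili was born in Tbilisi"), not from the evaluation data.]

```python
# Minimal sketch: load the saved best model and tag a Georgian sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-2/best-model.pt"
)

sentence = Sentence("გიორგი მარგველაშვილი დაიბადა თბილისში.")  # hypothetical input
tagger.predict(sentence)
print(sentence.get_spans("ner"))  # expect PER and LOC spans
```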