stefan-it: Upload ./training.log with huggingface_hub (commit 336a67d)
2023-10-23 18:01:56,366 ----------------------------------------------------------------------------------------------------
2023-10-23 18:01:56,368 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
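A back-of-the-envelope check on the embedding shapes printed in the repr above: the parameter count of the `BertEmbeddings` block follows directly from the printed dimensions. This is a sketch computed purely from those shapes (it covers only the layers shown, not the encoder stack):

```python
# Parameter count of the BertEmbeddings block, from the shapes in the repr.
def embedding_params(vocab_size=64001, max_pos=512, type_vocab=2, hidden=768):
    word = vocab_size * hidden   # (word_embeddings): Embedding(64001, 768)
    pos = max_pos * hidden       # (position_embeddings): Embedding(512, 768)
    typ = type_vocab * hidden    # (token_type_embeddings): Embedding(2, 768)
    ln = 2 * hidden              # LayerNorm weight + bias
    return word + pos + typ + ln

print(embedding_params())  # 49549056, dominated by the 64k-token vocabulary
```

The 64001-entry vocabulary alone accounts for roughly 49.2M of these parameters, which is why the "64k" vocab variant of this historic BERT model is noticeably larger than a standard 32k one.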
2023-10-23 18:01:56,368 ----------------------------------------------------------------------------------------------------
2023-10-23 18:01:56,368 MultiCorpus: 1214 train + 266 dev + 251 test sentences
- NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-23 18:01:56,368 ----------------------------------------------------------------------------------------------------
2023-10-23 18:01:56,368 Train: 1214 sentences
2023-10-23 18:01:56,368 (train_with_dev=False, train_with_test=False)
2023-10-23 18:01:56,368 ----------------------------------------------------------------------------------------------------
2023-10-23 18:01:56,368 Training Params:
2023-10-23 18:01:56,368 - learning_rate: "3e-05"
2023-10-23 18:01:56,368 - mini_batch_size: "4"
2023-10-23 18:01:56,368 - max_epochs: "10"
2023-10-23 18:01:56,368 - shuffle: "True"
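The per-epoch iteration count in the log follows from these parameters: 1214 training sentences at mini_batch_size 4 yield 304 mini-batches, which is exactly the "iter x/304" denominator in the epoch logs. A one-line sketch of that arithmetic:

```python
import math

# Mini-batches per epoch: 1214 train sentences / mini_batch_size 4,
# with a final partial batch (hence ceil), giving the "iter x/304" count.
def batches_per_epoch(n_sentences, batch_size):
    return math.ceil(n_sentences / batch_size)

print(batches_per_epoch(1214, 4))  # 304
```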
2023-10-23 18:01:56,368 ----------------------------------------------------------------------------------------------------
2023-10-23 18:01:56,368 Plugins:
2023-10-23 18:01:56,369 - TensorboardLogger
2023-10-23 18:01:56,369 - LinearScheduler | warmup_fraction: '0.1'
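The lr column in the epoch logs below reflects this schedule: linear warmup over the first 10% of steps, then linear decay to zero. A minimal sketch of my reading of that schedule (the function name and the exact warmup boundary handling are illustrative, not Flair's internals):

```python
# Linear warmup (first warmup_fraction of steps) then linear decay to 0.
def linear_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    warmup = int(total_steps * warmup_fraction)
    if step < warmup:
        return peak_lr * step / warmup
    return peak_lr * (total_steps - step) / (total_steps - warmup)

total = 304 * 10  # 304 mini-batches/epoch x 10 epochs = 3040 steps
print(linear_lr(30, total))    # ~3e-06, matching "lr: 0.000003" at epoch 1, iter 30
print(linear_lr(304, total))   # 3e-05, the peak at the end of epoch 1
```

This matches the logged trajectory: lr climbs through epoch 1 (304 warmup steps), peaks at 3e-05, and decays to 0.000000 by the end of epoch 10.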
2023-10-23 18:01:56,369 ----------------------------------------------------------------------------------------------------
2023-10-23 18:01:56,369 Final evaluation on model from best epoch (best-model.pt)
2023-10-23 18:01:56,369 - metric: "('micro avg', 'f1-score')"
2023-10-23 18:01:56,369 ----------------------------------------------------------------------------------------------------
2023-10-23 18:01:56,369 Computation:
2023-10-23 18:01:56,369 - compute on device: cuda:0
2023-10-23 18:01:56,369 - embedding storage: none
2023-10-23 18:01:56,369 ----------------------------------------------------------------------------------------------------
2023-10-23 18:01:56,369 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-23 18:01:56,369 ----------------------------------------------------------------------------------------------------
2023-10-23 18:01:56,369 ----------------------------------------------------------------------------------------------------
2023-10-23 18:01:56,369 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-23 18:01:57,733 epoch 1 - iter 30/304 - loss 3.22859122 - time (sec): 1.36 - samples/sec: 2400.04 - lr: 0.000003 - momentum: 0.000000
2023-10-23 18:01:59,126 epoch 1 - iter 60/304 - loss 2.50704888 - time (sec): 2.76 - samples/sec: 2214.52 - lr: 0.000006 - momentum: 0.000000
2023-10-23 18:02:00,751 epoch 1 - iter 90/304 - loss 1.89908248 - time (sec): 4.38 - samples/sec: 2108.92 - lr: 0.000009 - momentum: 0.000000
2023-10-23 18:02:02,385 epoch 1 - iter 120/304 - loss 1.51110590 - time (sec): 6.01 - samples/sec: 2116.56 - lr: 0.000012 - momentum: 0.000000
2023-10-23 18:02:04,015 epoch 1 - iter 150/304 - loss 1.29649003 - time (sec): 7.64 - samples/sec: 2052.14 - lr: 0.000015 - momentum: 0.000000
2023-10-23 18:02:05,578 epoch 1 - iter 180/304 - loss 1.14647134 - time (sec): 9.21 - samples/sec: 2026.89 - lr: 0.000018 - momentum: 0.000000
2023-10-23 18:02:06,982 epoch 1 - iter 210/304 - loss 1.02870170 - time (sec): 10.61 - samples/sec: 2036.76 - lr: 0.000021 - momentum: 0.000000
2023-10-23 18:02:08,620 epoch 1 - iter 240/304 - loss 0.92898649 - time (sec): 12.25 - samples/sec: 2000.77 - lr: 0.000024 - momentum: 0.000000
2023-10-23 18:02:10,251 epoch 1 - iter 270/304 - loss 0.85743135 - time (sec): 13.88 - samples/sec: 1977.16 - lr: 0.000027 - momentum: 0.000000
2023-10-23 18:02:11,888 epoch 1 - iter 300/304 - loss 0.78882405 - time (sec): 15.52 - samples/sec: 1974.88 - lr: 0.000030 - momentum: 0.000000
2023-10-23 18:02:12,104 ----------------------------------------------------------------------------------------------------
2023-10-23 18:02:12,104 EPOCH 1 done: loss 0.7834 - lr: 0.000030
2023-10-23 18:02:12,891 DEV : loss 0.16494107246398926 - f1-score (micro avg) 0.7329
2023-10-23 18:02:12,898 saving best model
2023-10-23 18:02:13,283 ----------------------------------------------------------------------------------------------------
2023-10-23 18:02:14,905 epoch 2 - iter 30/304 - loss 0.15164947 - time (sec): 1.62 - samples/sec: 1990.85 - lr: 0.000030 - momentum: 0.000000
2023-10-23 18:02:16,531 epoch 2 - iter 60/304 - loss 0.13202016 - time (sec): 3.25 - samples/sec: 1884.23 - lr: 0.000029 - momentum: 0.000000
2023-10-23 18:02:18,158 epoch 2 - iter 90/304 - loss 0.12360319 - time (sec): 4.87 - samples/sec: 1872.97 - lr: 0.000029 - momentum: 0.000000
2023-10-23 18:02:19,798 epoch 2 - iter 120/304 - loss 0.11741922 - time (sec): 6.51 - samples/sec: 1886.14 - lr: 0.000029 - momentum: 0.000000
2023-10-23 18:02:21,428 epoch 2 - iter 150/304 - loss 0.11472373 - time (sec): 8.14 - samples/sec: 1894.28 - lr: 0.000028 - momentum: 0.000000
2023-10-23 18:02:23,062 epoch 2 - iter 180/304 - loss 0.11310547 - time (sec): 9.78 - samples/sec: 1877.28 - lr: 0.000028 - momentum: 0.000000
2023-10-23 18:02:24,700 epoch 2 - iter 210/304 - loss 0.11854948 - time (sec): 11.42 - samples/sec: 1870.64 - lr: 0.000028 - momentum: 0.000000
2023-10-23 18:02:26,341 epoch 2 - iter 240/304 - loss 0.11943118 - time (sec): 13.06 - samples/sec: 1877.91 - lr: 0.000027 - momentum: 0.000000
2023-10-23 18:02:27,975 epoch 2 - iter 270/304 - loss 0.11686856 - time (sec): 14.69 - samples/sec: 1875.50 - lr: 0.000027 - momentum: 0.000000
2023-10-23 18:02:29,604 epoch 2 - iter 300/304 - loss 0.11705388 - time (sec): 16.32 - samples/sec: 1878.10 - lr: 0.000027 - momentum: 0.000000
2023-10-23 18:02:29,818 ----------------------------------------------------------------------------------------------------
2023-10-23 18:02:29,819 EPOCH 2 done: loss 0.1185 - lr: 0.000027
2023-10-23 18:02:30,694 DEV : loss 0.16803380846977234 - f1-score (micro avg) 0.7584
2023-10-23 18:02:30,701 saving best model
2023-10-23 18:02:31,223 ----------------------------------------------------------------------------------------------------
2023-10-23 18:02:32,811 epoch 3 - iter 30/304 - loss 0.07967270 - time (sec): 1.59 - samples/sec: 1921.60 - lr: 0.000026 - momentum: 0.000000
2023-10-23 18:02:34,348 epoch 3 - iter 60/304 - loss 0.06297211 - time (sec): 3.12 - samples/sec: 1900.93 - lr: 0.000026 - momentum: 0.000000
2023-10-23 18:02:35,882 epoch 3 - iter 90/304 - loss 0.06956546 - time (sec): 4.66 - samples/sec: 1922.63 - lr: 0.000026 - momentum: 0.000000
2023-10-23 18:02:37,403 epoch 3 - iter 120/304 - loss 0.06079199 - time (sec): 6.18 - samples/sec: 1947.62 - lr: 0.000025 - momentum: 0.000000
2023-10-23 18:02:38,935 epoch 3 - iter 150/304 - loss 0.06046460 - time (sec): 7.71 - samples/sec: 1945.46 - lr: 0.000025 - momentum: 0.000000
2023-10-23 18:02:40,475 epoch 3 - iter 180/304 - loss 0.06907627 - time (sec): 9.25 - samples/sec: 1968.52 - lr: 0.000025 - momentum: 0.000000
2023-10-23 18:02:42,002 epoch 3 - iter 210/304 - loss 0.06382555 - time (sec): 10.78 - samples/sec: 1998.71 - lr: 0.000024 - momentum: 0.000000
2023-10-23 18:02:43,516 epoch 3 - iter 240/304 - loss 0.06992834 - time (sec): 12.29 - samples/sec: 1999.91 - lr: 0.000024 - momentum: 0.000000
2023-10-23 18:02:45,059 epoch 3 - iter 270/304 - loss 0.07019082 - time (sec): 13.84 - samples/sec: 2007.13 - lr: 0.000024 - momentum: 0.000000
2023-10-23 18:02:46,586 epoch 3 - iter 300/304 - loss 0.07399141 - time (sec): 15.36 - samples/sec: 1997.92 - lr: 0.000023 - momentum: 0.000000
2023-10-23 18:02:46,785 ----------------------------------------------------------------------------------------------------
2023-10-23 18:02:46,785 EPOCH 3 done: loss 0.0734 - lr: 0.000023
2023-10-23 18:02:47,628 DEV : loss 0.15691877901554108 - f1-score (micro avg) 0.8208
2023-10-23 18:02:47,635 saving best model
2023-10-23 18:02:48,150 ----------------------------------------------------------------------------------------------------
2023-10-23 18:02:49,684 epoch 4 - iter 30/304 - loss 0.05174882 - time (sec): 1.53 - samples/sec: 1772.73 - lr: 0.000023 - momentum: 0.000000
2023-10-23 18:02:51,221 epoch 4 - iter 60/304 - loss 0.04709752 - time (sec): 3.07 - samples/sec: 1951.60 - lr: 0.000023 - momentum: 0.000000
2023-10-23 18:02:52,746 epoch 4 - iter 90/304 - loss 0.04508234 - time (sec): 4.59 - samples/sec: 1968.03 - lr: 0.000022 - momentum: 0.000000
2023-10-23 18:02:54,242 epoch 4 - iter 120/304 - loss 0.04271894 - time (sec): 6.09 - samples/sec: 1980.96 - lr: 0.000022 - momentum: 0.000000
2023-10-23 18:02:55,780 epoch 4 - iter 150/304 - loss 0.04832315 - time (sec): 7.63 - samples/sec: 2015.51 - lr: 0.000022 - momentum: 0.000000
2023-10-23 18:02:57,311 epoch 4 - iter 180/304 - loss 0.05250395 - time (sec): 9.16 - samples/sec: 1991.88 - lr: 0.000021 - momentum: 0.000000
2023-10-23 18:02:58,856 epoch 4 - iter 210/304 - loss 0.05605964 - time (sec): 10.70 - samples/sec: 2019.52 - lr: 0.000021 - momentum: 0.000000
2023-10-23 18:03:00,372 epoch 4 - iter 240/304 - loss 0.05140479 - time (sec): 12.22 - samples/sec: 2034.28 - lr: 0.000021 - momentum: 0.000000
2023-10-23 18:03:01,893 epoch 4 - iter 270/304 - loss 0.04958925 - time (sec): 13.74 - samples/sec: 2009.63 - lr: 0.000020 - momentum: 0.000000
2023-10-23 18:03:03,357 epoch 4 - iter 300/304 - loss 0.04728159 - time (sec): 15.21 - samples/sec: 2015.15 - lr: 0.000020 - momentum: 0.000000
2023-10-23 18:03:03,559 ----------------------------------------------------------------------------------------------------
2023-10-23 18:03:03,559 EPOCH 4 done: loss 0.0478 - lr: 0.000020
2023-10-23 18:03:04,391 DEV : loss 0.19513197243213654 - f1-score (micro avg) 0.8213
2023-10-23 18:03:04,398 saving best model
2023-10-23 18:03:04,924 ----------------------------------------------------------------------------------------------------
2023-10-23 18:03:06,450 epoch 5 - iter 30/304 - loss 0.07635624 - time (sec): 1.52 - samples/sec: 1831.95 - lr: 0.000020 - momentum: 0.000000
2023-10-23 18:03:07,958 epoch 5 - iter 60/304 - loss 0.05132725 - time (sec): 3.03 - samples/sec: 1928.72 - lr: 0.000019 - momentum: 0.000000
2023-10-23 18:03:09,489 epoch 5 - iter 90/304 - loss 0.04090643 - time (sec): 4.56 - samples/sec: 1965.49 - lr: 0.000019 - momentum: 0.000000
2023-10-23 18:03:11,031 epoch 5 - iter 120/304 - loss 0.04857162 - time (sec): 6.11 - samples/sec: 1972.13 - lr: 0.000019 - momentum: 0.000000
2023-10-23 18:03:12,558 epoch 5 - iter 150/304 - loss 0.04397227 - time (sec): 7.63 - samples/sec: 1997.32 - lr: 0.000018 - momentum: 0.000000
2023-10-23 18:03:14,091 epoch 5 - iter 180/304 - loss 0.04865740 - time (sec): 9.16 - samples/sec: 2011.86 - lr: 0.000018 - momentum: 0.000000
2023-10-23 18:03:15,615 epoch 5 - iter 210/304 - loss 0.04692325 - time (sec): 10.69 - samples/sec: 2027.76 - lr: 0.000018 - momentum: 0.000000
2023-10-23 18:03:17,149 epoch 5 - iter 240/304 - loss 0.04775950 - time (sec): 12.22 - samples/sec: 2023.10 - lr: 0.000017 - momentum: 0.000000
2023-10-23 18:03:18,691 epoch 5 - iter 270/304 - loss 0.04856867 - time (sec): 13.76 - samples/sec: 2024.70 - lr: 0.000017 - momentum: 0.000000
2023-10-23 18:03:20,227 epoch 5 - iter 300/304 - loss 0.04629497 - time (sec): 15.30 - samples/sec: 2001.90 - lr: 0.000017 - momentum: 0.000000
2023-10-23 18:03:20,426 ----------------------------------------------------------------------------------------------------
2023-10-23 18:03:20,426 EPOCH 5 done: loss 0.0460 - lr: 0.000017
2023-10-23 18:03:21,279 DEV : loss 0.20984667539596558 - f1-score (micro avg) 0.828
2023-10-23 18:03:21,286 saving best model
2023-10-23 18:03:21,793 ----------------------------------------------------------------------------------------------------
2023-10-23 18:03:23,327 epoch 6 - iter 30/304 - loss 0.05838661 - time (sec): 1.53 - samples/sec: 2011.13 - lr: 0.000016 - momentum: 0.000000
2023-10-23 18:03:24,857 epoch 6 - iter 60/304 - loss 0.03064926 - time (sec): 3.06 - samples/sec: 2018.03 - lr: 0.000016 - momentum: 0.000000
2023-10-23 18:03:26,386 epoch 6 - iter 90/304 - loss 0.03023633 - time (sec): 4.59 - samples/sec: 1970.53 - lr: 0.000016 - momentum: 0.000000
2023-10-23 18:03:27,934 epoch 6 - iter 120/304 - loss 0.02590112 - time (sec): 6.14 - samples/sec: 1978.96 - lr: 0.000015 - momentum: 0.000000
2023-10-23 18:03:29,472 epoch 6 - iter 150/304 - loss 0.03041437 - time (sec): 7.68 - samples/sec: 1999.47 - lr: 0.000015 - momentum: 0.000000
2023-10-23 18:03:31,003 epoch 6 - iter 180/304 - loss 0.02672928 - time (sec): 9.21 - samples/sec: 1969.57 - lr: 0.000015 - momentum: 0.000000
2023-10-23 18:03:32,548 epoch 6 - iter 210/304 - loss 0.02834806 - time (sec): 10.75 - samples/sec: 1969.87 - lr: 0.000014 - momentum: 0.000000
2023-10-23 18:03:34,074 epoch 6 - iter 240/304 - loss 0.02613951 - time (sec): 12.28 - samples/sec: 1971.87 - lr: 0.000014 - momentum: 0.000000
2023-10-23 18:03:35,769 epoch 6 - iter 270/304 - loss 0.02354283 - time (sec): 13.97 - samples/sec: 1947.89 - lr: 0.000014 - momentum: 0.000000
2023-10-23 18:03:37,313 epoch 6 - iter 300/304 - loss 0.02592855 - time (sec): 15.52 - samples/sec: 1977.22 - lr: 0.000013 - momentum: 0.000000
2023-10-23 18:03:37,514 ----------------------------------------------------------------------------------------------------
2023-10-23 18:03:37,514 EPOCH 6 done: loss 0.0257 - lr: 0.000013
2023-10-23 18:03:38,348 DEV : loss 0.2157229781150818 - f1-score (micro avg) 0.8479
2023-10-23 18:03:38,354 saving best model
2023-10-23 18:03:38,815 ----------------------------------------------------------------------------------------------------
2023-10-23 18:03:40,335 epoch 7 - iter 30/304 - loss 0.01535834 - time (sec): 1.52 - samples/sec: 1984.01 - lr: 0.000013 - momentum: 0.000000
2023-10-23 18:03:41,868 epoch 7 - iter 60/304 - loss 0.01430276 - time (sec): 3.05 - samples/sec: 2024.85 - lr: 0.000013 - momentum: 0.000000
2023-10-23 18:03:43,395 epoch 7 - iter 90/304 - loss 0.01880393 - time (sec): 4.58 - samples/sec: 2005.58 - lr: 0.000012 - momentum: 0.000000
2023-10-23 18:03:44,930 epoch 7 - iter 120/304 - loss 0.02169380 - time (sec): 6.11 - samples/sec: 2016.59 - lr: 0.000012 - momentum: 0.000000
2023-10-23 18:03:46,472 epoch 7 - iter 150/304 - loss 0.02009922 - time (sec): 7.66 - samples/sec: 2024.75 - lr: 0.000012 - momentum: 0.000000
2023-10-23 18:03:48,001 epoch 7 - iter 180/304 - loss 0.01942850 - time (sec): 9.19 - samples/sec: 2017.01 - lr: 0.000011 - momentum: 0.000000
2023-10-23 18:03:49,538 epoch 7 - iter 210/304 - loss 0.01929308 - time (sec): 10.72 - samples/sec: 2018.16 - lr: 0.000011 - momentum: 0.000000
2023-10-23 18:03:51,083 epoch 7 - iter 240/304 - loss 0.01755702 - time (sec): 12.27 - samples/sec: 2015.29 - lr: 0.000011 - momentum: 0.000000
2023-10-23 18:03:52,612 epoch 7 - iter 270/304 - loss 0.01901656 - time (sec): 13.80 - samples/sec: 1993.96 - lr: 0.000010 - momentum: 0.000000
2023-10-23 18:03:54,140 epoch 7 - iter 300/304 - loss 0.01857959 - time (sec): 15.32 - samples/sec: 1998.91 - lr: 0.000010 - momentum: 0.000000
2023-10-23 18:03:54,341 ----------------------------------------------------------------------------------------------------
2023-10-23 18:03:54,341 EPOCH 7 done: loss 0.0184 - lr: 0.000010
2023-10-23 18:03:55,223 DEV : loss 0.21933875977993011 - f1-score (micro avg) 0.8527
2023-10-23 18:03:55,230 saving best model
2023-10-23 18:03:55,738 ----------------------------------------------------------------------------------------------------
2023-10-23 18:03:57,273 epoch 8 - iter 30/304 - loss 0.02600286 - time (sec): 1.53 - samples/sec: 1926.23 - lr: 0.000010 - momentum: 0.000000
2023-10-23 18:03:58,805 epoch 8 - iter 60/304 - loss 0.02265957 - time (sec): 3.06 - samples/sec: 1866.21 - lr: 0.000009 - momentum: 0.000000
2023-10-23 18:04:00,338 epoch 8 - iter 90/304 - loss 0.01568299 - time (sec): 4.60 - samples/sec: 1866.53 - lr: 0.000009 - momentum: 0.000000
2023-10-23 18:04:01,884 epoch 8 - iter 120/304 - loss 0.01215458 - time (sec): 6.14 - samples/sec: 1911.08 - lr: 0.000009 - momentum: 0.000000
2023-10-23 18:04:03,421 epoch 8 - iter 150/304 - loss 0.01039071 - time (sec): 7.68 - samples/sec: 1965.96 - lr: 0.000008 - momentum: 0.000000
2023-10-23 18:04:04,957 epoch 8 - iter 180/304 - loss 0.01085885 - time (sec): 9.22 - samples/sec: 1960.59 - lr: 0.000008 - momentum: 0.000000
2023-10-23 18:04:06,484 epoch 8 - iter 210/304 - loss 0.01312501 - time (sec): 10.74 - samples/sec: 1959.40 - lr: 0.000008 - momentum: 0.000000
2023-10-23 18:04:08,015 epoch 8 - iter 240/304 - loss 0.01427002 - time (sec): 12.27 - samples/sec: 1962.25 - lr: 0.000007 - momentum: 0.000000
2023-10-23 18:04:09,549 epoch 8 - iter 270/304 - loss 0.01310105 - time (sec): 13.81 - samples/sec: 1981.07 - lr: 0.000007 - momentum: 0.000000
2023-10-23 18:04:11,090 epoch 8 - iter 300/304 - loss 0.01335148 - time (sec): 15.35 - samples/sec: 1994.69 - lr: 0.000007 - momentum: 0.000000
2023-10-23 18:04:11,291 ----------------------------------------------------------------------------------------------------
2023-10-23 18:04:11,291 EPOCH 8 done: loss 0.0132 - lr: 0.000007
2023-10-23 18:04:12,173 DEV : loss 0.21935363113880157 - f1-score (micro avg) 0.8592
2023-10-23 18:04:12,181 saving best model
2023-10-23 18:04:12,696 ----------------------------------------------------------------------------------------------------
2023-10-23 18:04:14,281 epoch 9 - iter 30/304 - loss 0.00558858 - time (sec): 1.58 - samples/sec: 1910.45 - lr: 0.000006 - momentum: 0.000000
2023-10-23 18:04:15,901 epoch 9 - iter 60/304 - loss 0.00323547 - time (sec): 3.20 - samples/sec: 1862.40 - lr: 0.000006 - momentum: 0.000000
2023-10-23 18:04:17,434 epoch 9 - iter 90/304 - loss 0.00723733 - time (sec): 4.73 - samples/sec: 1902.22 - lr: 0.000006 - momentum: 0.000000
2023-10-23 18:04:19,065 epoch 9 - iter 120/304 - loss 0.00777477 - time (sec): 6.37 - samples/sec: 1913.02 - lr: 0.000005 - momentum: 0.000000
2023-10-23 18:04:20,692 epoch 9 - iter 150/304 - loss 0.00973811 - time (sec): 7.99 - samples/sec: 1885.69 - lr: 0.000005 - momentum: 0.000000
2023-10-23 18:04:22,324 epoch 9 - iter 180/304 - loss 0.01101039 - time (sec): 9.62 - samples/sec: 1899.76 - lr: 0.000005 - momentum: 0.000000
2023-10-23 18:04:23,950 epoch 9 - iter 210/304 - loss 0.00950776 - time (sec): 11.25 - samples/sec: 1907.10 - lr: 0.000004 - momentum: 0.000000
2023-10-23 18:04:25,518 epoch 9 - iter 240/304 - loss 0.01071653 - time (sec): 12.82 - samples/sec: 1894.00 - lr: 0.000004 - momentum: 0.000000
2023-10-23 18:04:27,139 epoch 9 - iter 270/304 - loss 0.01041251 - time (sec): 14.44 - samples/sec: 1895.49 - lr: 0.000004 - momentum: 0.000000
2023-10-23 18:04:28,777 epoch 9 - iter 300/304 - loss 0.00988370 - time (sec): 16.08 - samples/sec: 1909.26 - lr: 0.000003 - momentum: 0.000000
2023-10-23 18:04:28,993 ----------------------------------------------------------------------------------------------------
2023-10-23 18:04:28,993 EPOCH 9 done: loss 0.0098 - lr: 0.000003
2023-10-23 18:04:29,869 DEV : loss 0.22314132750034332 - f1-score (micro avg) 0.8551
2023-10-23 18:04:29,876 ----------------------------------------------------------------------------------------------------
2023-10-23 18:04:31,510 epoch 10 - iter 30/304 - loss 0.00816345 - time (sec): 1.63 - samples/sec: 1922.50 - lr: 0.000003 - momentum: 0.000000
2023-10-23 18:04:33,128 epoch 10 - iter 60/304 - loss 0.00959374 - time (sec): 3.25 - samples/sec: 1868.76 - lr: 0.000003 - momentum: 0.000000
2023-10-23 18:04:34,744 epoch 10 - iter 90/304 - loss 0.00655196 - time (sec): 4.87 - samples/sec: 1832.29 - lr: 0.000002 - momentum: 0.000000
2023-10-23 18:04:36,364 epoch 10 - iter 120/304 - loss 0.00669931 - time (sec): 6.49 - samples/sec: 1862.12 - lr: 0.000002 - momentum: 0.000000
2023-10-23 18:04:37,742 epoch 10 - iter 150/304 - loss 0.00636619 - time (sec): 7.86 - samples/sec: 1901.79 - lr: 0.000002 - momentum: 0.000000
2023-10-23 18:04:39,089 epoch 10 - iter 180/304 - loss 0.00549301 - time (sec): 9.21 - samples/sec: 1979.22 - lr: 0.000001 - momentum: 0.000000
2023-10-23 18:04:40,628 epoch 10 - iter 210/304 - loss 0.00649187 - time (sec): 10.75 - samples/sec: 2002.25 - lr: 0.000001 - momentum: 0.000000
2023-10-23 18:04:42,260 epoch 10 - iter 240/304 - loss 0.00698066 - time (sec): 12.38 - samples/sec: 1998.51 - lr: 0.000001 - momentum: 0.000000
2023-10-23 18:04:43,890 epoch 10 - iter 270/304 - loss 0.00649229 - time (sec): 14.01 - samples/sec: 1983.64 - lr: 0.000000 - momentum: 0.000000
2023-10-23 18:04:45,499 epoch 10 - iter 300/304 - loss 0.00695700 - time (sec): 15.62 - samples/sec: 1966.21 - lr: 0.000000 - momentum: 0.000000
2023-10-23 18:04:45,710 ----------------------------------------------------------------------------------------------------
2023-10-23 18:04:45,710 EPOCH 10 done: loss 0.0069 - lr: 0.000000
2023-10-23 18:04:46,587 DEV : loss 0.22776642441749573 - f1-score (micro avg) 0.8554
2023-10-23 18:04:47,015 ----------------------------------------------------------------------------------------------------
2023-10-23 18:04:47,016 Loading model from best epoch ...
2023-10-23 18:04:48,769 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
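The 25-tag dictionary above is the BIOES span encoding over the six AJMC entity types plus the outside tag O. A sketch of how it expands (assuming, as the listing suggests, S/B/E/I prefixes grouped per type):

```python
# BIOES tagset: O plus {S, B, E, I} x 6 entity types = 1 + 24 = 25 tags.
entity_types = ["scope", "pers", "work", "loc", "date", "object"]
tags = ["O"] + [f"{p}-{t}" for t in entity_types for p in ("S", "B", "E", "I")]
print(len(tags))  # 25
print(tags[:5])   # ['O', 'S-scope', 'B-scope', 'E-scope', 'I-scope']
```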
2023-10-23 18:04:49,584
Results:
- F-score (micro) 0.8005
- F-score (macro) 0.6213
- Accuracy 0.6765
By class:
              precision    recall  f1-score   support

       scope     0.7455    0.8146    0.7785       151
        work     0.7478    0.9053    0.8190        95
        pers     0.7788    0.9167    0.8421        96
         loc     0.6667    0.6667    0.6667         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7494    0.8592    0.8005       348
   macro avg     0.5877    0.6606    0.6213       348
weighted avg     0.7482    0.8592    0.7994       348
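The summary rows can be re-derived from the per-class table: macro F1 is the unweighted mean of the per-class F1 scores (the zero on the 3-support "date" class drags it well below micro), and micro F1 is the harmonic mean of the micro-averaged precision and recall. A sketch of that arithmetic (recomputed from the rounded table values, so the micro figure drifts by ~1e-4 from the exact-count 0.8005):

```python
# Re-derive the macro and micro F-scores from the per-class table.
def f1(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

per_class_f1 = {"scope": 0.7785, "work": 0.8190, "pers": 0.8421,
                "loc": 0.6667, "date": 0.0000}
macro = sum(per_class_f1.values()) / len(per_class_f1)
micro = f1(0.7494, 0.8592)  # micro-averaged precision / recall from the table

print(round(macro, 4))  # 0.6213
print(round(micro, 3))  # ~0.801 (0.8005 when computed from exact TP/FP/FN counts)
```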
2023-10-23 18:04:49,584 ----------------------------------------------------------------------------------------------------