2023-10-17 08:48:55,885 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:55,886 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:48:55,886 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:55,886 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:48:55,886 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:55,886 Train: 1100 sentences
2023-10-17 08:48:55,886 (train_with_dev=False, train_with_test=False)
2023-10-17 08:48:55,886 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:55,886 Training Params:
2023-10-17 08:48:55,886 - learning_rate: "5e-05"
2023-10-17 08:48:55,886 - mini_batch_size: "4"
2023-10-17 08:48:55,886 - max_epochs: "10"
2023-10-17 08:48:55,886 - shuffle: "True"
2023-10-17 08:48:55,886 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:55,887 Plugins:
2023-10-17 08:48:55,887 - TensorboardLogger
2023-10-17 08:48:55,887 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:48:55,887 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:55,887 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:48:55,887 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:48:55,887 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:55,887 Computation:
2023-10-17 08:48:55,887 - compute on device: cuda:0
2023-10-17 08:48:55,887 - embedding storage: none
2023-10-17 08:48:55,887 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:55,887 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 08:48:55,887 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:55,887 ----------------------------------------------------------------------------------------------------
2023-10-17 08:48:55,887 Logging anything other than scalars to TensorBoard is
currently not supported. 2023-10-17 08:48:57,189 epoch 1 - iter 27/275 - loss 3.82248822 - time (sec): 1.30 - samples/sec: 1812.72 - lr: 0.000005 - momentum: 0.000000 2023-10-17 08:48:58,499 epoch 1 - iter 54/275 - loss 3.00028749 - time (sec): 2.61 - samples/sec: 1646.40 - lr: 0.000010 - momentum: 0.000000 2023-10-17 08:48:59,783 epoch 1 - iter 81/275 - loss 2.32531297 - time (sec): 3.90 - samples/sec: 1712.74 - lr: 0.000015 - momentum: 0.000000 2023-10-17 08:49:01,016 epoch 1 - iter 108/275 - loss 1.90971994 - time (sec): 5.13 - samples/sec: 1705.45 - lr: 0.000019 - momentum: 0.000000 2023-10-17 08:49:02,232 epoch 1 - iter 135/275 - loss 1.61165235 - time (sec): 6.34 - samples/sec: 1723.79 - lr: 0.000024 - momentum: 0.000000 2023-10-17 08:49:03,455 epoch 1 - iter 162/275 - loss 1.39866068 - time (sec): 7.57 - samples/sec: 1746.50 - lr: 0.000029 - momentum: 0.000000 2023-10-17 08:49:04,648 epoch 1 - iter 189/275 - loss 1.23507602 - time (sec): 8.76 - samples/sec: 1795.24 - lr: 0.000034 - momentum: 0.000000 2023-10-17 08:49:05,818 epoch 1 - iter 216/275 - loss 1.10155245 - time (sec): 9.93 - samples/sec: 1837.18 - lr: 0.000039 - momentum: 0.000000 2023-10-17 08:49:06,970 epoch 1 - iter 243/275 - loss 1.02083629 - time (sec): 11.08 - samples/sec: 1819.41 - lr: 0.000044 - momentum: 0.000000 2023-10-17 08:49:08,216 epoch 1 - iter 270/275 - loss 0.94422994 - time (sec): 12.33 - samples/sec: 1813.98 - lr: 0.000049 - momentum: 0.000000 2023-10-17 08:49:08,435 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:49:08,435 EPOCH 1 done: loss 0.9301 - lr: 0.000049 2023-10-17 08:49:09,128 DEV : loss 0.18998880684375763 - f1-score (micro avg) 0.7718 2023-10-17 08:49:09,133 saving best model 2023-10-17 08:49:09,453 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:49:10,646 epoch 2 - iter 27/275 - loss 0.19771864 - time (sec): 1.19 - samples/sec: 
1897.13 - lr: 0.000049 - momentum: 0.000000 2023-10-17 08:49:11,838 epoch 2 - iter 54/275 - loss 0.18078864 - time (sec): 2.38 - samples/sec: 1879.34 - lr: 0.000049 - momentum: 0.000000 2023-10-17 08:49:13,069 epoch 2 - iter 81/275 - loss 0.16888107 - time (sec): 3.61 - samples/sec: 1844.34 - lr: 0.000048 - momentum: 0.000000 2023-10-17 08:49:14,296 epoch 2 - iter 108/275 - loss 0.17030036 - time (sec): 4.84 - samples/sec: 1868.12 - lr: 0.000048 - momentum: 0.000000 2023-10-17 08:49:15,500 epoch 2 - iter 135/275 - loss 0.16720867 - time (sec): 6.05 - samples/sec: 1855.67 - lr: 0.000047 - momentum: 0.000000 2023-10-17 08:49:16,715 epoch 2 - iter 162/275 - loss 0.16947469 - time (sec): 7.26 - samples/sec: 1863.99 - lr: 0.000047 - momentum: 0.000000 2023-10-17 08:49:17,929 epoch 2 - iter 189/275 - loss 0.17047367 - time (sec): 8.47 - samples/sec: 1859.20 - lr: 0.000046 - momentum: 0.000000 2023-10-17 08:49:19,167 epoch 2 - iter 216/275 - loss 0.17531921 - time (sec): 9.71 - samples/sec: 1888.06 - lr: 0.000046 - momentum: 0.000000 2023-10-17 08:49:20,375 epoch 2 - iter 243/275 - loss 0.17237634 - time (sec): 10.92 - samples/sec: 1877.41 - lr: 0.000045 - momentum: 0.000000 2023-10-17 08:49:21,602 epoch 2 - iter 270/275 - loss 0.16957074 - time (sec): 12.15 - samples/sec: 1848.01 - lr: 0.000045 - momentum: 0.000000 2023-10-17 08:49:21,825 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:49:21,825 EPOCH 2 done: loss 0.1709 - lr: 0.000045 2023-10-17 08:49:22,458 DEV : loss 0.18550686538219452 - f1-score (micro avg) 0.824 2023-10-17 08:49:22,462 saving best model 2023-10-17 08:49:22,891 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:49:24,207 epoch 3 - iter 27/275 - loss 0.12270099 - time (sec): 1.31 - samples/sec: 1606.52 - lr: 0.000044 - momentum: 0.000000 2023-10-17 08:49:25,506 epoch 3 - iter 54/275 - loss 0.09803233 - time (sec): 2.61 - 
samples/sec: 1509.06 - lr: 0.000043 - momentum: 0.000000 2023-10-17 08:49:26,764 epoch 3 - iter 81/275 - loss 0.09418740 - time (sec): 3.87 - samples/sec: 1601.17 - lr: 0.000043 - momentum: 0.000000 2023-10-17 08:49:27,994 epoch 3 - iter 108/275 - loss 0.10543827 - time (sec): 5.10 - samples/sec: 1699.99 - lr: 0.000042 - momentum: 0.000000 2023-10-17 08:49:29,225 epoch 3 - iter 135/275 - loss 0.10796688 - time (sec): 6.33 - samples/sec: 1699.71 - lr: 0.000042 - momentum: 0.000000 2023-10-17 08:49:30,466 epoch 3 - iter 162/275 - loss 0.10881893 - time (sec): 7.57 - samples/sec: 1728.72 - lr: 0.000041 - momentum: 0.000000 2023-10-17 08:49:31,684 epoch 3 - iter 189/275 - loss 0.11070236 - time (sec): 8.79 - samples/sec: 1734.97 - lr: 0.000041 - momentum: 0.000000 2023-10-17 08:49:32,894 epoch 3 - iter 216/275 - loss 0.10809719 - time (sec): 10.00 - samples/sec: 1755.49 - lr: 0.000040 - momentum: 0.000000 2023-10-17 08:49:34,147 epoch 3 - iter 243/275 - loss 0.10810593 - time (sec): 11.25 - samples/sec: 1771.20 - lr: 0.000040 - momentum: 0.000000 2023-10-17 08:49:35,385 epoch 3 - iter 270/275 - loss 0.11303403 - time (sec): 12.49 - samples/sec: 1784.32 - lr: 0.000039 - momentum: 0.000000 2023-10-17 08:49:35,619 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:49:35,619 EPOCH 3 done: loss 0.1122 - lr: 0.000039 2023-10-17 08:49:36,325 DEV : loss 0.16965429484844208 - f1-score (micro avg) 0.8448 2023-10-17 08:49:36,329 saving best model 2023-10-17 08:49:36,790 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:49:38,021 epoch 4 - iter 27/275 - loss 0.06300598 - time (sec): 1.23 - samples/sec: 1727.72 - lr: 0.000038 - momentum: 0.000000 2023-10-17 08:49:39,256 epoch 4 - iter 54/275 - loss 0.05146376 - time (sec): 2.46 - samples/sec: 1758.25 - lr: 0.000038 - momentum: 0.000000 2023-10-17 08:49:40,479 epoch 4 - iter 81/275 - loss 0.09438839 - time 
(sec): 3.69 - samples/sec: 1757.01 - lr: 0.000037 - momentum: 0.000000 2023-10-17 08:49:41,724 epoch 4 - iter 108/275 - loss 0.09296037 - time (sec): 4.93 - samples/sec: 1769.01 - lr: 0.000037 - momentum: 0.000000 2023-10-17 08:49:42,952 epoch 4 - iter 135/275 - loss 0.09238656 - time (sec): 6.16 - samples/sec: 1761.09 - lr: 0.000036 - momentum: 0.000000 2023-10-17 08:49:44,188 epoch 4 - iter 162/275 - loss 0.09545995 - time (sec): 7.40 - samples/sec: 1796.30 - lr: 0.000036 - momentum: 0.000000 2023-10-17 08:49:45,411 epoch 4 - iter 189/275 - loss 0.09007918 - time (sec): 8.62 - samples/sec: 1787.86 - lr: 0.000035 - momentum: 0.000000 2023-10-17 08:49:46,643 epoch 4 - iter 216/275 - loss 0.08952483 - time (sec): 9.85 - samples/sec: 1815.73 - lr: 0.000035 - momentum: 0.000000 2023-10-17 08:49:47,879 epoch 4 - iter 243/275 - loss 0.08857733 - time (sec): 11.09 - samples/sec: 1813.51 - lr: 0.000034 - momentum: 0.000000 2023-10-17 08:49:49,136 epoch 4 - iter 270/275 - loss 0.09220419 - time (sec): 12.34 - samples/sec: 1804.50 - lr: 0.000034 - momentum: 0.000000 2023-10-17 08:49:49,356 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:49:49,356 EPOCH 4 done: loss 0.0907 - lr: 0.000034 2023-10-17 08:49:49,995 DEV : loss 0.1864178627729416 - f1-score (micro avg) 0.8524 2023-10-17 08:49:50,000 saving best model 2023-10-17 08:49:50,426 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:49:51,685 epoch 5 - iter 27/275 - loss 0.12790019 - time (sec): 1.26 - samples/sec: 1934.72 - lr: 0.000033 - momentum: 0.000000 2023-10-17 08:49:52,890 epoch 5 - iter 54/275 - loss 0.09482667 - time (sec): 2.46 - samples/sec: 1865.38 - lr: 0.000032 - momentum: 0.000000 2023-10-17 08:49:54,116 epoch 5 - iter 81/275 - loss 0.08039190 - time (sec): 3.69 - samples/sec: 1794.41 - lr: 0.000032 - momentum: 0.000000 2023-10-17 08:49:55,316 epoch 5 - iter 108/275 - loss 
0.08969888 - time (sec): 4.89 - samples/sec: 1755.60 - lr: 0.000031 - momentum: 0.000000 2023-10-17 08:49:56,531 epoch 5 - iter 135/275 - loss 0.08769977 - time (sec): 6.10 - samples/sec: 1784.72 - lr: 0.000031 - momentum: 0.000000 2023-10-17 08:49:57,760 epoch 5 - iter 162/275 - loss 0.07990395 - time (sec): 7.33 - samples/sec: 1791.48 - lr: 0.000030 - momentum: 0.000000 2023-10-17 08:49:58,991 epoch 5 - iter 189/275 - loss 0.07643210 - time (sec): 8.56 - samples/sec: 1804.80 - lr: 0.000030 - momentum: 0.000000 2023-10-17 08:50:00,220 epoch 5 - iter 216/275 - loss 0.07245098 - time (sec): 9.79 - samples/sec: 1830.89 - lr: 0.000029 - momentum: 0.000000 2023-10-17 08:50:01,431 epoch 5 - iter 243/275 - loss 0.06950953 - time (sec): 11.00 - samples/sec: 1830.96 - lr: 0.000029 - momentum: 0.000000 2023-10-17 08:50:02,649 epoch 5 - iter 270/275 - loss 0.06779922 - time (sec): 12.22 - samples/sec: 1832.61 - lr: 0.000028 - momentum: 0.000000 2023-10-17 08:50:02,877 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:50:02,878 EPOCH 5 done: loss 0.0669 - lr: 0.000028 2023-10-17 08:50:03,511 DEV : loss 0.16903042793273926 - f1-score (micro avg) 0.8663 2023-10-17 08:50:03,516 saving best model 2023-10-17 08:50:03,970 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:50:05,172 epoch 6 - iter 27/275 - loss 0.05910721 - time (sec): 1.20 - samples/sec: 1666.81 - lr: 0.000027 - momentum: 0.000000 2023-10-17 08:50:06,397 epoch 6 - iter 54/275 - loss 0.06915115 - time (sec): 2.42 - samples/sec: 1731.26 - lr: 0.000027 - momentum: 0.000000 2023-10-17 08:50:07,625 epoch 6 - iter 81/275 - loss 0.06797154 - time (sec): 3.65 - samples/sec: 1780.83 - lr: 0.000026 - momentum: 0.000000 2023-10-17 08:50:08,844 epoch 6 - iter 108/275 - loss 0.06531264 - time (sec): 4.87 - samples/sec: 1774.06 - lr: 0.000026 - momentum: 0.000000 2023-10-17 08:50:10,056 epoch 6 - iter 
135/275 - loss 0.05964926 - time (sec): 6.08 - samples/sec: 1797.51 - lr: 0.000025 - momentum: 0.000000 2023-10-17 08:50:11,287 epoch 6 - iter 162/275 - loss 0.05173399 - time (sec): 7.31 - samples/sec: 1794.59 - lr: 0.000025 - momentum: 0.000000 2023-10-17 08:50:12,522 epoch 6 - iter 189/275 - loss 0.05150710 - time (sec): 8.55 - samples/sec: 1794.17 - lr: 0.000024 - momentum: 0.000000 2023-10-17 08:50:13,760 epoch 6 - iter 216/275 - loss 0.05236797 - time (sec): 9.79 - samples/sec: 1799.74 - lr: 0.000024 - momentum: 0.000000 2023-10-17 08:50:15,008 epoch 6 - iter 243/275 - loss 0.05636757 - time (sec): 11.04 - samples/sec: 1819.67 - lr: 0.000023 - momentum: 0.000000 2023-10-17 08:50:16,240 epoch 6 - iter 270/275 - loss 0.05370161 - time (sec): 12.27 - samples/sec: 1816.80 - lr: 0.000022 - momentum: 0.000000 2023-10-17 08:50:16,473 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:50:16,473 EPOCH 6 done: loss 0.0527 - lr: 0.000022 2023-10-17 08:50:17,163 DEV : loss 0.19920983910560608 - f1-score (micro avg) 0.8738 2023-10-17 08:50:17,168 saving best model 2023-10-17 08:50:17,614 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:50:18,885 epoch 7 - iter 27/275 - loss 0.06705101 - time (sec): 1.27 - samples/sec: 1625.34 - lr: 0.000022 - momentum: 0.000000 2023-10-17 08:50:20,102 epoch 7 - iter 54/275 - loss 0.05309652 - time (sec): 2.48 - samples/sec: 1775.45 - lr: 0.000021 - momentum: 0.000000 2023-10-17 08:50:21,360 epoch 7 - iter 81/275 - loss 0.06157296 - time (sec): 3.74 - samples/sec: 1789.79 - lr: 0.000021 - momentum: 0.000000 2023-10-17 08:50:22,608 epoch 7 - iter 108/275 - loss 0.05165244 - time (sec): 4.99 - samples/sec: 1789.51 - lr: 0.000020 - momentum: 0.000000 2023-10-17 08:50:23,836 epoch 7 - iter 135/275 - loss 0.04265718 - time (sec): 6.22 - samples/sec: 1776.49 - lr: 0.000020 - momentum: 0.000000 2023-10-17 08:50:25,066 
epoch 7 - iter 162/275 - loss 0.03858431 - time (sec): 7.45 - samples/sec: 1798.34 - lr: 0.000019 - momentum: 0.000000 2023-10-17 08:50:26,303 epoch 7 - iter 189/275 - loss 0.03652655 - time (sec): 8.69 - samples/sec: 1826.50 - lr: 0.000019 - momentum: 0.000000 2023-10-17 08:50:27,502 epoch 7 - iter 216/275 - loss 0.03525941 - time (sec): 9.89 - samples/sec: 1818.61 - lr: 0.000018 - momentum: 0.000000 2023-10-17 08:50:28,704 epoch 7 - iter 243/275 - loss 0.03571872 - time (sec): 11.09 - samples/sec: 1807.42 - lr: 0.000017 - momentum: 0.000000 2023-10-17 08:50:29,908 epoch 7 - iter 270/275 - loss 0.03739856 - time (sec): 12.29 - samples/sec: 1817.32 - lr: 0.000017 - momentum: 0.000000 2023-10-17 08:50:30,140 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:50:30,140 EPOCH 7 done: loss 0.0367 - lr: 0.000017 2023-10-17 08:50:30,838 DEV : loss 0.18746249377727509 - f1-score (micro avg) 0.8761 2023-10-17 08:50:30,843 saving best model 2023-10-17 08:50:31,282 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:50:32,564 epoch 8 - iter 27/275 - loss 0.00694663 - time (sec): 1.28 - samples/sec: 1887.76 - lr: 0.000016 - momentum: 0.000000 2023-10-17 08:50:33,784 epoch 8 - iter 54/275 - loss 0.01940040 - time (sec): 2.50 - samples/sec: 1857.57 - lr: 0.000016 - momentum: 0.000000 2023-10-17 08:50:34,995 epoch 8 - iter 81/275 - loss 0.01675556 - time (sec): 3.71 - samples/sec: 1850.34 - lr: 0.000015 - momentum: 0.000000 2023-10-17 08:50:36,228 epoch 8 - iter 108/275 - loss 0.01499351 - time (sec): 4.94 - samples/sec: 1821.51 - lr: 0.000015 - momentum: 0.000000 2023-10-17 08:50:37,467 epoch 8 - iter 135/275 - loss 0.01379214 - time (sec): 6.18 - samples/sec: 1870.49 - lr: 0.000014 - momentum: 0.000000 2023-10-17 08:50:38,686 epoch 8 - iter 162/275 - loss 0.01536753 - time (sec): 7.40 - samples/sec: 1842.69 - lr: 0.000014 - momentum: 0.000000 
2023-10-17 08:50:39,932 epoch 8 - iter 189/275 - loss 0.01547651 - time (sec): 8.65 - samples/sec: 1840.68 - lr: 0.000013 - momentum: 0.000000 2023-10-17 08:50:41,149 epoch 8 - iter 216/275 - loss 0.02440394 - time (sec): 9.86 - samples/sec: 1835.57 - lr: 0.000012 - momentum: 0.000000 2023-10-17 08:50:42,403 epoch 8 - iter 243/275 - loss 0.02452509 - time (sec): 11.12 - samples/sec: 1827.30 - lr: 0.000012 - momentum: 0.000000 2023-10-17 08:50:43,610 epoch 8 - iter 270/275 - loss 0.02315233 - time (sec): 12.33 - samples/sec: 1819.39 - lr: 0.000011 - momentum: 0.000000 2023-10-17 08:50:43,845 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:50:43,845 EPOCH 8 done: loss 0.0236 - lr: 0.000011 2023-10-17 08:50:44,503 DEV : loss 0.18050651252269745 - f1-score (micro avg) 0.8824 2023-10-17 08:50:44,507 saving best model 2023-10-17 08:50:44,952 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:50:46,212 epoch 9 - iter 27/275 - loss 0.02741986 - time (sec): 1.26 - samples/sec: 1900.36 - lr: 0.000011 - momentum: 0.000000 2023-10-17 08:50:47,442 epoch 9 - iter 54/275 - loss 0.02099426 - time (sec): 2.49 - samples/sec: 1934.93 - lr: 0.000010 - momentum: 0.000000 2023-10-17 08:50:48,640 epoch 9 - iter 81/275 - loss 0.01543202 - time (sec): 3.68 - samples/sec: 1866.88 - lr: 0.000010 - momentum: 0.000000 2023-10-17 08:50:49,860 epoch 9 - iter 108/275 - loss 0.01638367 - time (sec): 4.90 - samples/sec: 1823.06 - lr: 0.000009 - momentum: 0.000000 2023-10-17 08:50:51,188 epoch 9 - iter 135/275 - loss 0.01630896 - time (sec): 6.23 - samples/sec: 1762.81 - lr: 0.000009 - momentum: 0.000000 2023-10-17 08:50:52,422 epoch 9 - iter 162/275 - loss 0.02265358 - time (sec): 7.47 - samples/sec: 1777.55 - lr: 0.000008 - momentum: 0.000000 2023-10-17 08:50:53,675 epoch 9 - iter 189/275 - loss 0.02334752 - time (sec): 8.72 - samples/sec: 1778.80 - lr: 0.000007 - 
momentum: 0.000000 2023-10-17 08:50:54,914 epoch 9 - iter 216/275 - loss 0.02140354 - time (sec): 9.96 - samples/sec: 1769.99 - lr: 0.000007 - momentum: 0.000000 2023-10-17 08:50:56,137 epoch 9 - iter 243/275 - loss 0.02042862 - time (sec): 11.18 - samples/sec: 1781.86 - lr: 0.000006 - momentum: 0.000000 2023-10-17 08:50:57,400 epoch 9 - iter 270/275 - loss 0.02027750 - time (sec): 12.44 - samples/sec: 1790.44 - lr: 0.000006 - momentum: 0.000000 2023-10-17 08:50:57,651 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:50:57,651 EPOCH 9 done: loss 0.0200 - lr: 0.000006 2023-10-17 08:50:58,324 DEV : loss 0.18982398509979248 - f1-score (micro avg) 0.8806 2023-10-17 08:50:58,329 ---------------------------------------------------------------------------------------------------- 2023-10-17 08:50:59,559 epoch 10 - iter 27/275 - loss 0.00426859 - time (sec): 1.23 - samples/sec: 1738.20 - lr: 0.000005 - momentum: 0.000000 2023-10-17 08:51:00,828 epoch 10 - iter 54/275 - loss 0.00929609 - time (sec): 2.50 - samples/sec: 1823.60 - lr: 0.000005 - momentum: 0.000000 2023-10-17 08:51:02,068 epoch 10 - iter 81/275 - loss 0.01188040 - time (sec): 3.74 - samples/sec: 1806.78 - lr: 0.000004 - momentum: 0.000000 2023-10-17 08:51:03,343 epoch 10 - iter 108/275 - loss 0.00961700 - time (sec): 5.01 - samples/sec: 1725.88 - lr: 0.000004 - momentum: 0.000000 2023-10-17 08:51:04,569 epoch 10 - iter 135/275 - loss 0.00945368 - time (sec): 6.24 - samples/sec: 1738.12 - lr: 0.000003 - momentum: 0.000000 2023-10-17 08:51:05,792 epoch 10 - iter 162/275 - loss 0.01019720 - time (sec): 7.46 - samples/sec: 1749.50 - lr: 0.000002 - momentum: 0.000000 2023-10-17 08:51:07,080 epoch 10 - iter 189/275 - loss 0.00913857 - time (sec): 8.75 - samples/sec: 1773.33 - lr: 0.000002 - momentum: 0.000000 2023-10-17 08:51:08,377 epoch 10 - iter 216/275 - loss 0.01442529 - time (sec): 10.05 - samples/sec: 1774.17 - lr: 0.000001 - momentum: 0.000000 
2023-10-17 08:51:09,613 epoch 10 - iter 243/275 - loss 0.01322089 - time (sec): 11.28 - samples/sec: 1784.49 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:51:10,831 epoch 10 - iter 270/275 - loss 0.01241629 - time (sec): 12.50 - samples/sec: 1787.20 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:51:11,066 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:11,066 EPOCH 10 done: loss 0.0122 - lr: 0.000000
2023-10-17 08:51:11,702 DEV : loss 0.19630618393421173 - f1-score (micro avg) 0.8854
2023-10-17 08:51:11,707 saving best model
2023-10-17 08:51:12,484 ----------------------------------------------------------------------------------------------------
2023-10-17 08:51:12,485 Loading model from best epoch ...
2023-10-17 08:51:13,852 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:51:14,649
Results:
- F-score (micro) 0.9048
- F-score (macro) 0.74
- Accuracy 0.8341

By class:
              precision    recall  f1-score   support

       scope     0.8947    0.8693    0.8818       176
        pers     0.9843    0.9766    0.9804       128
        work     0.8378    0.8378    0.8378        74
      object     1.0000    1.0000    1.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.9144    0.8953    0.9048       382
   macro avg     0.7434    0.7367    0.7400       382
weighted avg     0.9096    0.8953    0.9023       382

2023-10-17 08:51:14,649 ----------------------------------------------------------------------------------------------------