2023-10-25 21:20:06,302 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:06,303 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 21:20:06,303 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:06,303 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-25 21:20:06,303 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:06,303 Train: 1085 sentences
2023-10-25 21:20:06,303 (train_with_dev=False, train_with_test=False)
2023-10-25 21:20:06,303 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:06,303 Training Params:
2023-10-25 21:20:06,303 - learning_rate: "3e-05"
2023-10-25 21:20:06,303 - mini_batch_size: "8"
2023-10-25 21:20:06,304 - max_epochs: "10"
2023-10-25 21:20:06,304 - shuffle: "True"
2023-10-25 21:20:06,304 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:06,304 Plugins:
2023-10-25 21:20:06,304 - TensorboardLogger
2023-10-25 21:20:06,304 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 21:20:06,304 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:06,304 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 21:20:06,304 - metric: "('micro avg', 'f1-score')"
2023-10-25 21:20:06,304 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:06,304 Computation:
2023-10-25 21:20:06,304 - compute on device: cuda:0
2023-10-25 21:20:06,304 - embedding storage: none
2023-10-25 21:20:06,304 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:06,304 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-25 21:20:06,304 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:06,304 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:06,304 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 21:20:07,355 epoch 1 - iter 13/136 - loss 3.19537858 - time (sec): 1.05 - samples/sec: 5006.81 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:20:08,259 epoch 1 - iter 26/136 - loss 2.76016229 - time (sec): 1.95 - samples/sec: 5426.73 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:20:09,146 epoch 1 - iter 39/136 - loss 2.31898621 - time (sec): 2.84 - samples/sec: 5352.27 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:20:10,159 epoch 1 - iter 52/136 - loss 1.87763258 - time (sec): 3.85 - samples/sec: 5221.03 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:20:11,143 epoch 1 - iter 65/136 - loss 1.62815715 - time (sec): 4.84 - samples/sec: 5007.79 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:20:12,209 epoch 1 - iter 78/136 - loss 1.39999778 - time (sec): 5.90 - samples/sec: 5074.14 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:20:13,409 epoch 1 - iter 91/136 - loss 1.24281698 - time (sec): 7.10 - samples/sec: 4954.26 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:20:14,325 epoch 1 - iter 104/136 - loss 1.13361333 - time (sec): 8.02 - samples/sec: 4962.06 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:20:15,286 epoch 1 - iter 117/136 - loss 1.04110203 - time (sec): 8.98 - samples/sec: 4950.04 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:20:16,373 epoch 1 - iter 130/136 - loss 0.96009861 - time (sec): 10.07 - samples/sec: 4950.16 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:20:16,839 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:16,839 EPOCH 1 done: loss 0.9272 - lr: 0.000028
2023-10-25 21:20:17,482 DEV : loss 0.16158084571361542 - f1-score (micro avg) 0.6429
2023-10-25 21:20:17,488 saving best model
2023-10-25 21:20:17,990 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:18,887 epoch 2 - iter 13/136 - loss 0.14478029 - time (sec): 0.90 - samples/sec: 5170.78 - lr: 0.000030 - momentum: 0.000000
2023-10-25 21:20:20,273 epoch 2 - iter 26/136 - loss 0.13903227 - time (sec): 2.28 - samples/sec: 4372.66 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:20:21,307 epoch 2 - iter 39/136 - loss 0.14383107 - time (sec): 3.32 - samples/sec: 4307.00 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:20:22,354 epoch 2 - iter 52/136 - loss 0.14741814 - time (sec): 4.36 - samples/sec: 4615.83 - lr: 0.000029 - momentum: 0.000000
2023-10-25 21:20:23,357 epoch 2 - iter 65/136 - loss 0.14244389 - time (sec): 5.37 - samples/sec: 4763.95 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:20:24,327 epoch 2 - iter 78/136 - loss 0.14786401 - time (sec): 6.34 - samples/sec: 4760.82 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:20:25,424 epoch 2 - iter 91/136 - loss 0.14575958 - time (sec): 7.43 - samples/sec: 4838.69 - lr: 0.000028 - momentum: 0.000000
2023-10-25 21:20:26,475 epoch 2 - iter 104/136 - loss 0.14340023 - time (sec): 8.48 - samples/sec: 4900.45 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:20:27,349 epoch 2 - iter 117/136 - loss 0.14202521 - time (sec): 9.36 - samples/sec: 4899.16 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:20:28,296 epoch 2 - iter 130/136 - loss 0.14295048 - time (sec): 10.31 - samples/sec: 4888.77 - lr: 0.000027 - momentum: 0.000000
2023-10-25 21:20:28,683 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:28,683 EPOCH 2 done: loss 0.1434 - lr: 0.000027
2023-10-25 21:20:29,924 DEV : loss 0.09990036487579346 - f1-score (micro avg) 0.7468
2023-10-25 21:20:29,930 saving best model
2023-10-25 21:20:30,644 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:31,648 epoch 3 - iter 13/136 - loss 0.11960243 - time (sec): 1.00 - samples/sec: 4942.09 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:20:32,586 epoch 3 - iter 26/136 - loss 0.09759495 - time (sec): 1.94 - samples/sec: 5254.25 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:20:33,650 epoch 3 - iter 39/136 - loss 0.08584252 - time (sec): 3.00 - samples/sec: 5065.02 - lr: 0.000026 - momentum: 0.000000
2023-10-25 21:20:34,640 epoch 3 - iter 52/136 - loss 0.08045591 - time (sec): 3.99 - samples/sec: 5160.60 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:20:35,645 epoch 3 - iter 65/136 - loss 0.07965884 - time (sec): 5.00 - samples/sec: 5163.72 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:20:36,647 epoch 3 - iter 78/136 - loss 0.07654363 - time (sec): 6.00 - samples/sec: 5105.24 - lr: 0.000025 - momentum: 0.000000
2023-10-25 21:20:37,586 epoch 3 - iter 91/136 - loss 0.07736097 - time (sec): 6.94 - samples/sec: 5017.45 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:20:38,660 epoch 3 - iter 104/136 - loss 0.07706710 - time (sec): 8.01 - samples/sec: 4990.12 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:20:39,700 epoch 3 - iter 117/136 - loss 0.07804906 - time (sec): 9.05 - samples/sec: 4981.63 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:20:40,611 epoch 3 - iter 130/136 - loss 0.07904440 - time (sec): 9.97 - samples/sec: 4955.02 - lr: 0.000024 - momentum: 0.000000
2023-10-25 21:20:41,066 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:41,066 EPOCH 3 done: loss 0.0780 - lr: 0.000024
2023-10-25 21:20:42,197 DEV : loss 0.10422874242067337 - f1-score (micro avg) 0.7585
2023-10-25 21:20:42,203 saving best model
2023-10-25 21:20:42,876 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:44,128 epoch 4 - iter 13/136 - loss 0.04067450 - time (sec): 1.25 - samples/sec: 4058.14 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:20:45,027 epoch 4 - iter 26/136 - loss 0.05076903 - time (sec): 2.15 - samples/sec: 4540.53 - lr: 0.000023 - momentum: 0.000000
2023-10-25 21:20:46,128 epoch 4 - iter 39/136 - loss 0.04630433 - time (sec): 3.25 - samples/sec: 4867.37 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:20:47,087 epoch 4 - iter 52/136 - loss 0.04968443 - time (sec): 4.21 - samples/sec: 4902.47 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:20:48,116 epoch 4 - iter 65/136 - loss 0.04696549 - time (sec): 5.24 - samples/sec: 4932.60 - lr: 0.000022 - momentum: 0.000000
2023-10-25 21:20:49,083 epoch 4 - iter 78/136 - loss 0.04539194 - time (sec): 6.21 - samples/sec: 4925.62 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:20:50,011 epoch 4 - iter 91/136 - loss 0.04798111 - time (sec): 7.13 - samples/sec: 5005.83 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:20:50,929 epoch 4 - iter 104/136 - loss 0.04633553 - time (sec): 8.05 - samples/sec: 5037.74 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:20:51,856 epoch 4 - iter 117/136 - loss 0.04647949 - time (sec): 8.98 - samples/sec: 5046.46 - lr: 0.000021 - momentum: 0.000000
2023-10-25 21:20:52,836 epoch 4 - iter 130/136 - loss 0.04697246 - time (sec): 9.96 - samples/sec: 5034.81 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:20:53,211 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:53,211 EPOCH 4 done: loss 0.0462 - lr: 0.000020
2023-10-25 21:20:54,399 DEV : loss 0.10547740757465363 - f1-score (micro avg) 0.7802
2023-10-25 21:20:54,405 saving best model
2023-10-25 21:20:55,135 ----------------------------------------------------------------------------------------------------
2023-10-25 21:20:56,129 epoch 5 - iter 13/136 - loss 0.01663346 - time (sec): 0.99 - samples/sec: 5001.88 - lr: 0.000020 - momentum: 0.000000
2023-10-25 21:20:57,031 epoch 5 - iter 26/136 - loss 0.02629321 - time (sec): 1.89 - samples/sec: 4870.26 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:20:58,002 epoch 5 - iter 39/136 - loss 0.02794944 - time (sec): 2.86 - samples/sec: 4938.10 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:20:58,982 epoch 5 - iter 52/136 - loss 0.02661567 - time (sec): 3.84 - samples/sec: 5046.85 - lr: 0.000019 - momentum: 0.000000
2023-10-25 21:20:59,832 epoch 5 - iter 65/136 - loss 0.02489471 - time (sec): 4.69 - samples/sec: 4924.08 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:21:00,804 epoch 5 - iter 78/136 - loss 0.02726968 - time (sec): 5.66 - samples/sec: 4960.06 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:21:01,828 epoch 5 - iter 91/136 - loss 0.02863633 - time (sec): 6.69 - samples/sec: 5056.21 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:21:02,735 epoch 5 - iter 104/136 - loss 0.02720610 - time (sec): 7.60 - samples/sec: 5102.12 - lr: 0.000018 - momentum: 0.000000
2023-10-25 21:21:03,693 epoch 5 - iter 117/136 - loss 0.02758476 - time (sec): 8.55 - samples/sec: 5141.63 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:21:04,781 epoch 5 - iter 130/136 - loss 0.02952936 - time (sec): 9.64 - samples/sec: 5162.52 - lr: 0.000017 - momentum: 0.000000
2023-10-25 21:21:05,187 ----------------------------------------------------------------------------------------------------
2023-10-25 21:21:05,187 EPOCH 5 done: loss 0.0292 - lr: 0.000017
2023-10-25 21:21:06,376 DEV : loss 0.12244772911071777 - f1-score (micro avg) 0.7927
2023-10-25 21:21:06,382 saving best model
2023-10-25 21:21:07,096 ----------------------------------------------------------------------------------------------------
2023-10-25 21:21:08,438 epoch 6 - iter 13/136 - loss 0.01683556 - time (sec): 1.34 - samples/sec: 3876.93 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:21:09,366 epoch 6 - iter 26/136 - loss 0.01690694 - time (sec): 2.27 - samples/sec: 4564.47 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:21:10,317 epoch 6 - iter 39/136 - loss 0.01687007 - time (sec): 3.22 - samples/sec: 4624.06 - lr: 0.000016 - momentum: 0.000000
2023-10-25 21:21:11,401 epoch 6 - iter 52/136 - loss 0.01889602 - time (sec): 4.30 - samples/sec: 4578.36 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:21:12,369 epoch 6 - iter 65/136 - loss 0.01614339 - time (sec): 5.27 - samples/sec: 4657.62 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:21:13,359 epoch 6 - iter 78/136 - loss 0.01655885 - time (sec): 6.26 - samples/sec: 4801.57 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:21:14,288 epoch 6 - iter 91/136 - loss 0.01946751 - time (sec): 7.19 - samples/sec: 4851.68 - lr: 0.000015 - momentum: 0.000000
2023-10-25 21:21:15,295 epoch 6 - iter 104/136 - loss 0.01892046 - time (sec): 8.19 - samples/sec: 4848.33 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:21:16,230 epoch 6 - iter 117/136 - loss 0.01916421 - time (sec): 9.13 - samples/sec: 4873.62 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:21:17,157 epoch 6 - iter 130/136 - loss 0.01967563 - time (sec): 10.06 - samples/sec: 4894.41 - lr: 0.000014 - momentum: 0.000000
2023-10-25 21:21:17,580 ----------------------------------------------------------------------------------------------------
2023-10-25 21:21:17,580 EPOCH 6 done: loss 0.0194 - lr: 0.000014
2023-10-25 21:21:18,716 DEV : loss 0.1296090930700302 - f1-score (micro avg) 0.7934
2023-10-25 21:21:18,722 saving best model
2023-10-25 21:21:19,434 ----------------------------------------------------------------------------------------------------
2023-10-25 21:21:20,348 epoch 7 - iter 13/136 - loss 0.01080744 - time (sec): 0.91 - samples/sec: 5668.49 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:21:21,361 epoch 7 - iter 26/136 - loss 0.01106692 - time (sec): 1.92 - samples/sec: 5326.04 - lr: 0.000013 - momentum: 0.000000
2023-10-25 21:21:22,288 epoch 7 - iter 39/136 - loss 0.01073739 - time (sec): 2.85 - samples/sec: 5389.66 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:21:23,287 epoch 7 - iter 52/136 - loss 0.01029492 - time (sec): 3.85 - samples/sec: 5354.39 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:21:24,361 epoch 7 - iter 65/136 - loss 0.01122951 - time (sec): 4.92 - samples/sec: 5259.47 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:21:25,294 epoch 7 - iter 78/136 - loss 0.01059452 - time (sec): 5.86 - samples/sec: 5231.06 - lr: 0.000012 - momentum: 0.000000
2023-10-25 21:21:26,352 epoch 7 - iter 91/136 - loss 0.01186817 - time (sec): 6.92 - samples/sec: 5155.37 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:21:27,230 epoch 7 - iter 104/136 - loss 0.01209798 - time (sec): 7.79 - samples/sec: 5208.33 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:21:28,112 epoch 7 - iter 117/136 - loss 0.01394009 - time (sec): 8.68 - samples/sec: 5185.34 - lr: 0.000011 - momentum: 0.000000
2023-10-25 21:21:29,033 epoch 7 - iter 130/136 - loss 0.01372476 - time (sec): 9.60 - samples/sec: 5200.08 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:21:29,485 ----------------------------------------------------------------------------------------------------
2023-10-25 21:21:29,486 EPOCH 7 done: loss 0.0136 - lr: 0.000010
2023-10-25 21:21:30,725 DEV : loss 0.1454724222421646 - f1-score (micro avg) 0.792
2023-10-25 21:21:30,732 ----------------------------------------------------------------------------------------------------
2023-10-25 21:21:32,074 epoch 8 - iter 13/136 - loss 0.01218010 - time (sec): 1.34 - samples/sec: 4384.56 - lr: 0.000010 - momentum: 0.000000
2023-10-25 21:21:33,025 epoch 8 - iter 26/136 - loss 0.01693779 - time (sec): 2.29 - samples/sec: 4772.46 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:21:34,093 epoch 8 - iter 39/136 - loss 0.01703926 - time (sec): 3.36 - samples/sec: 4856.76 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:21:34,973 epoch 8 - iter 52/136 - loss 0.01542773 - time (sec): 4.24 - samples/sec: 4941.74 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:21:35,998 epoch 8 - iter 65/136 - loss 0.01460635 - time (sec): 5.26 - samples/sec: 4954.47 - lr: 0.000009 - momentum: 0.000000
2023-10-25 21:21:36,935 epoch 8 - iter 78/136 - loss 0.01371676 - time (sec): 6.20 - samples/sec: 5011.86 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:21:37,953 epoch 8 - iter 91/136 - loss 0.01319613 - time (sec): 7.22 - samples/sec: 4994.88 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:21:38,901 epoch 8 - iter 104/136 - loss 0.01203454 - time (sec): 8.17 - samples/sec: 5027.05 - lr: 0.000008 - momentum: 0.000000
2023-10-25 21:21:39,909 epoch 8 - iter 117/136 - loss 0.01120187 - time (sec): 9.18 - samples/sec: 4985.48 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:21:40,947 epoch 8 - iter 130/136 - loss 0.01051438 - time (sec): 10.21 - samples/sec: 4914.41 - lr: 0.000007 - momentum: 0.000000
2023-10-25 21:21:41,391 ----------------------------------------------------------------------------------------------------
2023-10-25 21:21:41,392 EPOCH 8 done: loss 0.0106 - lr: 0.000007
2023-10-25 21:21:42,626 DEV : loss 0.1543840914964676 - f1-score (micro avg) 0.8059
2023-10-25 21:21:42,632 saving best model
2023-10-25 21:21:43,341 ----------------------------------------------------------------------------------------------------
2023-10-25 21:21:44,407 epoch 9 - iter 13/136 - loss 0.00445394 - time (sec): 1.06 - samples/sec: 4658.17 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:21:45,462 epoch 9 - iter 26/136 - loss 0.00733912 - time (sec): 2.12 - samples/sec: 5035.08 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:21:46,388 epoch 9 - iter 39/136 - loss 0.00679164 - time (sec): 3.05 - samples/sec: 5053.37 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:21:47,268 epoch 9 - iter 52/136 - loss 0.00759414 - time (sec): 3.93 - samples/sec: 4980.00 - lr: 0.000006 - momentum: 0.000000
2023-10-25 21:21:48,269 epoch 9 - iter 65/136 - loss 0.00752308 - time (sec): 4.93 - samples/sec: 5085.53 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:21:49,176 epoch 9 - iter 78/136 - loss 0.00768783 - time (sec): 5.83 - samples/sec: 5083.57 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:21:50,246 epoch 9 - iter 91/136 - loss 0.00732468 - time (sec): 6.90 - samples/sec: 5165.41 - lr: 0.000005 - momentum: 0.000000
2023-10-25 21:21:51,141 epoch 9 - iter 104/136 - loss 0.00684016 - time (sec): 7.80 - samples/sec: 5111.40 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:21:52,117 epoch 9 - iter 117/136 - loss 0.00700631 - time (sec): 8.77 - samples/sec: 5067.64 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:21:53,131 epoch 9 - iter 130/136 - loss 0.00730047 - time (sec): 9.79 - samples/sec: 5056.15 - lr: 0.000004 - momentum: 0.000000
2023-10-25 21:21:53,632 ----------------------------------------------------------------------------------------------------
2023-10-25 21:21:53,632 EPOCH 9 done: loss 0.0075 - lr: 0.000004
2023-10-25 21:21:54,814 DEV : loss 0.16468970477581024 - f1-score (micro avg) 0.8059
2023-10-25 21:21:54,821 ----------------------------------------------------------------------------------------------------
2023-10-25 21:21:56,267 epoch 10 - iter 13/136 - loss 0.01036517 - time (sec): 1.44 - samples/sec: 3369.07 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:21:57,242 epoch 10 - iter 26/136 - loss 0.01231050 - time (sec): 2.42 - samples/sec: 4055.70 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:21:58,278 epoch 10 - iter 39/136 - loss 0.00954086 - time (sec): 3.45 - samples/sec: 4327.08 - lr: 0.000003 - momentum: 0.000000
2023-10-25 21:21:59,305 epoch 10 - iter 52/136 - loss 0.00847034 - time (sec): 4.48 - samples/sec: 4621.28 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:22:00,240 epoch 10 - iter 65/136 - loss 0.00800394 - time (sec): 5.42 - samples/sec: 4682.11 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:22:01,177 epoch 10 - iter 78/136 - loss 0.00757495 - time (sec): 6.35 - samples/sec: 4711.89 - lr: 0.000002 - momentum: 0.000000
2023-10-25 21:22:02,192 epoch 10 - iter 91/136 - loss 0.00773310 - time (sec): 7.37 - samples/sec: 4804.46 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:22:03,206 epoch 10 - iter 104/136 - loss 0.00705573 - time (sec): 8.38 - samples/sec: 4813.82 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:22:04,042 epoch 10 - iter 117/136 - loss 0.00676122 - time (sec): 9.22 - samples/sec: 4834.45 - lr: 0.000001 - momentum: 0.000000
2023-10-25 21:22:05,026 epoch 10 - iter 130/136 - loss 0.00658402 - time (sec): 10.20 - samples/sec: 4874.95 - lr: 0.000000 - momentum: 0.000000
2023-10-25 21:22:05,439 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:05,439 EPOCH 10 done: loss 0.0069 - lr: 0.000000
2023-10-25 21:22:06,638 DEV : loss 0.16538743674755096 - f1-score (micro avg) 0.7877
2023-10-25 21:22:07,172 ----------------------------------------------------------------------------------------------------
2023-10-25 21:22:07,174 Loading model from best epoch ...
2023-10-25 21:22:09,130 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-25 21:22:11,237 Results:
- F-score (micro) 0.7892
- F-score (macro) 0.7524
- Accuracy 0.664

By class:
              precision    recall  f1-score   support

         LOC     0.8201    0.8622    0.8406       312
         PER     0.6973    0.8750    0.7761       208
         ORG     0.5000    0.4545    0.4762        55
   HumanProd     0.8462    1.0000    0.9167        22

   micro avg     0.7489    0.8342    0.7892       597
   macro avg     0.7159    0.7979    0.7524       597
weighted avg     0.7488    0.8342    0.7874       597

2023-10-25 21:22:11,237 ----------------------------------------------------------------------------------------------------