|
2023-10-11 13:27:33,976 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:27:33,978 Model: "SequenceTagger( |
|
(embeddings): ByT5Embeddings( |
|
(model): T5EncoderModel( |
|
(shared): Embedding(384, 1472) |
|
(encoder): T5Stack( |
|
(embed_tokens): Embedding(384, 1472) |
|
(block): ModuleList( |
|
(0): T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
(relative_attention_bias): Embedding(32, 6) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(1-11): 11 x T5Block( |
|
(layer): ModuleList( |
|
(0): T5LayerSelfAttention( |
|
(SelfAttention): T5Attention( |
|
(q): Linear(in_features=1472, out_features=384, bias=False) |
|
(k): Linear(in_features=1472, out_features=384, bias=False) |
|
(v): Linear(in_features=1472, out_features=384, bias=False) |
|
(o): Linear(in_features=384, out_features=1472, bias=False) |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(1): T5LayerFF( |
|
(DenseReluDense): T5DenseGatedActDense( |
|
(wi_0): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wi_1): Linear(in_features=1472, out_features=3584, bias=False) |
|
(wo): Linear(in_features=3584, out_features=1472, bias=False) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
(act): NewGELUActivation() |
|
) |
|
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=1472, out_features=17, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-11 13:27:33,978 ---------------------------------------------------------------------------------------------------- |
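The model dump above is a ByT5-small encoder used without its decoder: a 384-entry byte-level vocabulary, hidden size 1472, 12 encoder blocks, 6 attention heads of size 64 (hence the 1472 -> 384 q/k/v projections), and a gated-GELU feed-forward of width 3584. As a minimal sketch, the same backbone can be loaded directly with Hugging Face transformers; the checkpoint ID used here is an assumption inferred from the training base path logged below.

```python
# Sketch only: load the encoder-only ByT5 backbone described by the dump above.
from transformers import AutoTokenizer, T5EncoderModel

# Assumed checkpoint ID, inferred from the logged training base path.
checkpoint = "hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)  # ByT5 tokenizes raw UTF-8 bytes
encoder = T5EncoderModel.from_pretrained(checkpoint)   # encoder stack only, no decoder

print(encoder.config.vocab_size, encoder.config.d_model)  # 384, 1472 for ByT5-small
```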
|
2023-10-11 13:27:33,978 MultiCorpus: 1085 train + 148 dev + 364 test sentences |
|
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator |
|
2023-10-11 13:27:33,978 ---------------------------------------------------------------------------------------------------- |
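The corpus is the Swedish NewsEye subset of HIPE-2022, which Flair can download and cache on its own. A minimal sketch, assuming the standard flair.datasets loader and its default v2.1 release:

```python
# Sketch only: load the HIPE-2022 NewsEye Swedish corpus as logged above.
from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(dataset_name="newseye", language="sv")
print(corpus)  # as logged: 1085 train + 148 dev + 364 test sentences
```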
|
2023-10-11 13:27:33,979 Train: 1085 sentences |
|
2023-10-11 13:27:33,979 (train_with_dev=False, train_with_test=False) |
|
2023-10-11 13:27:33,979 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:27:33,979 Training Params: |
|
2023-10-11 13:27:33,979 - learning_rate: "0.00016" |
|
2023-10-11 13:27:33,979 - mini_batch_size: "4" |
|
2023-10-11 13:27:33,979 - max_epochs: "10" |
|
2023-10-11 13:27:33,979 - shuffle: "True" |
|
2023-10-11 13:27:33,979 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:27:33,979 Plugins: |
|
2023-10-11 13:27:33,979 - TensorboardLogger |
|
2023-10-11 13:27:33,979 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-11 13:27:33,979 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:27:33,979 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-11 13:27:33,980 - metric: "('micro avg', 'f1-score')" |
|
2023-10-11 13:27:33,980 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:27:33,980 Computation: |
|
2023-10-11 13:27:33,980 - compute on device: cuda:0 |
|
2023-10-11 13:27:33,980 - embedding storage: none |
|
2023-10-11 13:27:33,980 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:27:33,980 Model training base path: "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5" |
|
2023-10-11 13:27:33,980 ---------------------------------------------------------------------------------------------------- |
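Taken together, the settings above (ByT5 embeddings, last layer only, first-subtoken pooling, no CRF, no RNN, learning rate 0.00016, mini-batch size 4, 10 epochs, linear schedule with 10% warmup) describe a standard Flair fine-tuning run. The sketch below reconstructs it under those assumptions; the embedding checkpoint ID and hidden_size are not stated in the log and are assumptions, while the output path is the logged base path.

```python
# Sketch only: reconstruct the logged fine-tuning setup with Flair (assumptions noted).
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = NER_HIPE_2022(dataset_name="newseye", language="sv")  # as in the earlier sketch

label_type = "ner"
label_dict = corpus.make_label_dictionary(label_type=label_type)  # LOC, PER, ORG, HumanProd

embeddings = TransformerWordEmbeddings(
    model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",  # assumed ID
    layers="-1",               # "layers-1" in the base path: last encoder layer only
    subtoken_pooling="first",  # "poolingfirst" in the base path
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,           # unused without an RNN; value assumed
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type=label_type,
    use_crf=False,             # "crfFalse" in the base path
    use_rnn=False,             # plain linear classifier on top, as in the model dump
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5",
    learning_rate=0.00016,
    mini_batch_size=4,
    max_epochs=10,
)
```

The LinearScheduler with warmup_fraction 0.1 listed under Plugins corresponds to fine_tune's linear warmup scheduling; the TensorboardLogger is an additional trainer plugin attached to the same run.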
|
2023-10-11 13:27:33,980 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:27:33,980 Logging anything other than scalars to TensorBoard is currently not supported. |
|
2023-10-11 13:27:43,531 epoch 1 - iter 27/272 - loss 2.81999244 - time (sec): 9.55 - samples/sec: 591.04 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-11 13:27:52,452 epoch 1 - iter 54/272 - loss 2.81118479 - time (sec): 18.47 - samples/sec: 561.40 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-11 13:28:01,343 epoch 1 - iter 81/272 - loss 2.79096698 - time (sec): 27.36 - samples/sec: 545.67 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-11 13:28:11,435 epoch 1 - iter 108/272 - loss 2.71614989 - time (sec): 37.45 - samples/sec: 566.63 - lr: 0.000063 - momentum: 0.000000 |
|
2023-10-11 13:28:20,784 epoch 1 - iter 135/272 - loss 2.62925889 - time (sec): 46.80 - samples/sec: 564.79 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-11 13:28:30,396 epoch 1 - iter 162/272 - loss 2.52390517 - time (sec): 56.41 - samples/sec: 562.22 - lr: 0.000095 - momentum: 0.000000 |
|
2023-10-11 13:28:39,473 epoch 1 - iter 189/272 - loss 2.41837798 - time (sec): 65.49 - samples/sec: 559.85 - lr: 0.000111 - momentum: 0.000000 |
|
2023-10-11 13:28:48,377 epoch 1 - iter 216/272 - loss 2.31158997 - time (sec): 74.40 - samples/sec: 556.49 - lr: 0.000126 - momentum: 0.000000 |
|
2023-10-11 13:28:57,118 epoch 1 - iter 243/272 - loss 2.21251411 - time (sec): 83.14 - samples/sec: 551.90 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-11 13:29:07,861 epoch 1 - iter 270/272 - loss 2.06737558 - time (sec): 93.88 - samples/sec: 552.07 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-11 13:29:08,291 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:29:08,292 EPOCH 1 done: loss 2.0636 - lr: 0.000158 |
|
2023-10-11 13:29:13,402 DEV : loss 0.7253165245056152 - f1-score (micro avg) 0.0 |
|
2023-10-11 13:29:13,409 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:29:22,469 epoch 2 - iter 27/272 - loss 0.73318608 - time (sec): 9.06 - samples/sec: 526.52 - lr: 0.000158 - momentum: 0.000000 |
|
2023-10-11 13:29:32,473 epoch 2 - iter 54/272 - loss 0.66531252 - time (sec): 19.06 - samples/sec: 526.20 - lr: 0.000157 - momentum: 0.000000 |
|
2023-10-11 13:29:42,956 epoch 2 - iter 81/272 - loss 0.64023609 - time (sec): 29.54 - samples/sec: 512.28 - lr: 0.000155 - momentum: 0.000000 |
|
2023-10-11 13:29:53,234 epoch 2 - iter 108/272 - loss 0.59586906 - time (sec): 39.82 - samples/sec: 523.70 - lr: 0.000153 - momentum: 0.000000 |
|
2023-10-11 13:30:03,244 epoch 2 - iter 135/272 - loss 0.57638010 - time (sec): 49.83 - samples/sec: 520.32 - lr: 0.000151 - momentum: 0.000000 |
|
2023-10-11 13:30:13,822 epoch 2 - iter 162/272 - loss 0.55581269 - time (sec): 60.41 - samples/sec: 519.74 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-11 13:30:23,650 epoch 2 - iter 189/272 - loss 0.52874034 - time (sec): 70.24 - samples/sec: 522.78 - lr: 0.000148 - momentum: 0.000000 |
|
2023-10-11 13:30:32,781 epoch 2 - iter 216/272 - loss 0.50817052 - time (sec): 79.37 - samples/sec: 520.59 - lr: 0.000146 - momentum: 0.000000 |
|
2023-10-11 13:30:42,130 epoch 2 - iter 243/272 - loss 0.49516648 - time (sec): 88.72 - samples/sec: 520.92 - lr: 0.000144 - momentum: 0.000000 |
|
2023-10-11 13:30:52,007 epoch 2 - iter 270/272 - loss 0.47371239 - time (sec): 98.60 - samples/sec: 523.10 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-11 13:30:52,637 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:30:52,637 EPOCH 2 done: loss 0.4728 - lr: 0.000142 |
|
2023-10-11 13:30:58,739 DEV : loss 0.2730832099914551 - f1-score (micro avg) 0.458 |
|
2023-10-11 13:30:58,747 saving best model |
|
2023-10-11 13:30:59,637 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:31:08,898 epoch 3 - iter 27/272 - loss 0.32431702 - time (sec): 9.26 - samples/sec: 502.47 - lr: 0.000141 - momentum: 0.000000 |
|
2023-10-11 13:31:20,050 epoch 3 - iter 54/272 - loss 0.29067923 - time (sec): 20.41 - samples/sec: 546.54 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-11 13:31:30,352 epoch 3 - iter 81/272 - loss 0.27571265 - time (sec): 30.71 - samples/sec: 544.60 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-11 13:31:39,766 epoch 3 - iter 108/272 - loss 0.27279233 - time (sec): 40.13 - samples/sec: 531.75 - lr: 0.000135 - momentum: 0.000000 |
|
2023-10-11 13:31:49,454 epoch 3 - iter 135/272 - loss 0.26429444 - time (sec): 49.81 - samples/sec: 530.94 - lr: 0.000133 - momentum: 0.000000 |
|
2023-10-11 13:31:59,524 epoch 3 - iter 162/272 - loss 0.26191426 - time (sec): 59.88 - samples/sec: 533.41 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-11 13:32:08,926 epoch 3 - iter 189/272 - loss 0.25973280 - time (sec): 69.29 - samples/sec: 530.00 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-11 13:32:18,526 epoch 3 - iter 216/272 - loss 0.25380917 - time (sec): 78.89 - samples/sec: 529.89 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-11 13:32:27,975 epoch 3 - iter 243/272 - loss 0.24522414 - time (sec): 88.34 - samples/sec: 527.87 - lr: 0.000126 - momentum: 0.000000 |
|
2023-10-11 13:32:37,515 epoch 3 - iter 270/272 - loss 0.24504080 - time (sec): 97.88 - samples/sec: 529.60 - lr: 0.000125 - momentum: 0.000000 |
|
2023-10-11 13:32:37,912 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:32:37,912 EPOCH 3 done: loss 0.2450 - lr: 0.000125 |
|
2023-10-11 13:32:43,468 DEV : loss 0.17829611897468567 - f1-score (micro avg) 0.6035 |
|
2023-10-11 13:32:43,475 saving best model |
|
2023-10-11 13:32:46,048 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:32:55,066 epoch 4 - iter 27/272 - loss 0.17217050 - time (sec): 9.01 - samples/sec: 523.11 - lr: 0.000123 - momentum: 0.000000 |
|
2023-10-11 13:33:04,366 epoch 4 - iter 54/272 - loss 0.16719853 - time (sec): 18.31 - samples/sec: 549.27 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-11 13:33:14,336 epoch 4 - iter 81/272 - loss 0.15878769 - time (sec): 28.28 - samples/sec: 564.10 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-11 13:33:23,963 epoch 4 - iter 108/272 - loss 0.15319958 - time (sec): 37.91 - samples/sec: 565.67 - lr: 0.000117 - momentum: 0.000000 |
|
2023-10-11 13:33:32,824 epoch 4 - iter 135/272 - loss 0.15294648 - time (sec): 46.77 - samples/sec: 563.12 - lr: 0.000116 - momentum: 0.000000 |
|
2023-10-11 13:33:42,222 epoch 4 - iter 162/272 - loss 0.14383828 - time (sec): 56.17 - samples/sec: 565.86 - lr: 0.000114 - momentum: 0.000000 |
|
2023-10-11 13:33:51,342 epoch 4 - iter 189/272 - loss 0.14409440 - time (sec): 65.29 - samples/sec: 561.27 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-11 13:34:00,576 epoch 4 - iter 216/272 - loss 0.14278202 - time (sec): 74.52 - samples/sec: 562.45 - lr: 0.000110 - momentum: 0.000000 |
|
2023-10-11 13:34:09,522 epoch 4 - iter 243/272 - loss 0.14476740 - time (sec): 83.47 - samples/sec: 561.39 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-11 13:34:18,777 epoch 4 - iter 270/272 - loss 0.14220216 - time (sec): 92.72 - samples/sec: 558.81 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-11 13:34:19,205 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:34:19,206 EPOCH 4 done: loss 0.1425 - lr: 0.000107 |
|
2023-10-11 13:34:24,887 DEV : loss 0.13749206066131592 - f1-score (micro avg) 0.6835 |
|
2023-10-11 13:34:24,895 saving best model |
|
2023-10-11 13:34:27,471 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:34:37,467 epoch 5 - iter 27/272 - loss 0.13559335 - time (sec): 9.99 - samples/sec: 585.00 - lr: 0.000105 - momentum: 0.000000 |
|
2023-10-11 13:34:47,240 epoch 5 - iter 54/272 - loss 0.13141056 - time (sec): 19.76 - samples/sec: 567.27 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-11 13:34:55,969 epoch 5 - iter 81/272 - loss 0.12138647 - time (sec): 28.49 - samples/sec: 545.04 - lr: 0.000101 - momentum: 0.000000 |
|
2023-10-11 13:35:05,652 epoch 5 - iter 108/272 - loss 0.11370414 - time (sec): 38.18 - samples/sec: 542.72 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-11 13:35:14,285 epoch 5 - iter 135/272 - loss 0.11107776 - time (sec): 46.81 - samples/sec: 534.80 - lr: 0.000098 - momentum: 0.000000 |
|
2023-10-11 13:35:24,070 epoch 5 - iter 162/272 - loss 0.10256163 - time (sec): 56.59 - samples/sec: 538.09 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-11 13:35:33,336 epoch 5 - iter 189/272 - loss 0.10008297 - time (sec): 65.86 - samples/sec: 539.87 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-11 13:35:43,685 epoch 5 - iter 216/272 - loss 0.10184709 - time (sec): 76.21 - samples/sec: 546.33 - lr: 0.000093 - momentum: 0.000000 |
|
2023-10-11 13:35:53,096 epoch 5 - iter 243/272 - loss 0.09643962 - time (sec): 85.62 - samples/sec: 541.20 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-11 13:36:02,816 epoch 5 - iter 270/272 - loss 0.09603504 - time (sec): 95.34 - samples/sec: 538.26 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-11 13:36:03,663 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:36:03,663 EPOCH 5 done: loss 0.0953 - lr: 0.000089 |
|
2023-10-11 13:36:09,473 DEV : loss 0.12472429126501083 - f1-score (micro avg) 0.7786 |
|
2023-10-11 13:36:09,481 saving best model |
|
2023-10-11 13:36:12,337 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:36:23,136 epoch 6 - iter 27/272 - loss 0.07520327 - time (sec): 10.80 - samples/sec: 505.01 - lr: 0.000087 - momentum: 0.000000 |
|
2023-10-11 13:36:32,097 epoch 6 - iter 54/272 - loss 0.07291755 - time (sec): 19.76 - samples/sec: 502.56 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-11 13:36:41,923 epoch 6 - iter 81/272 - loss 0.07828051 - time (sec): 29.58 - samples/sec: 521.33 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-11 13:36:51,453 epoch 6 - iter 108/272 - loss 0.07753702 - time (sec): 39.11 - samples/sec: 524.34 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-11 13:37:01,082 epoch 6 - iter 135/272 - loss 0.07120305 - time (sec): 48.74 - samples/sec: 526.52 - lr: 0.000080 - momentum: 0.000000 |
|
2023-10-11 13:37:10,272 epoch 6 - iter 162/272 - loss 0.07288072 - time (sec): 57.93 - samples/sec: 524.83 - lr: 0.000078 - momentum: 0.000000 |
|
2023-10-11 13:37:19,939 epoch 6 - iter 189/272 - loss 0.06861992 - time (sec): 67.60 - samples/sec: 526.14 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-11 13:37:30,055 epoch 6 - iter 216/272 - loss 0.06950290 - time (sec): 77.71 - samples/sec: 531.23 - lr: 0.000075 - momentum: 0.000000 |
|
2023-10-11 13:37:39,592 epoch 6 - iter 243/272 - loss 0.06815041 - time (sec): 87.25 - samples/sec: 529.93 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-11 13:37:49,504 epoch 6 - iter 270/272 - loss 0.06691943 - time (sec): 97.16 - samples/sec: 532.72 - lr: 0.000071 - momentum: 0.000000 |
|
2023-10-11 13:37:49,963 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:37:49,963 EPOCH 6 done: loss 0.0675 - lr: 0.000071 |
|
2023-10-11 13:37:55,754 DEV : loss 0.13598552346229553 - f1-score (micro avg) 0.7818 |
|
2023-10-11 13:37:55,762 saving best model |
|
2023-10-11 13:37:58,488 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:38:08,928 epoch 7 - iter 27/272 - loss 0.06175330 - time (sec): 10.44 - samples/sec: 570.42 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-11 13:38:18,475 epoch 7 - iter 54/272 - loss 0.05347701 - time (sec): 19.98 - samples/sec: 564.63 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-11 13:38:28,109 epoch 7 - iter 81/272 - loss 0.05882144 - time (sec): 29.62 - samples/sec: 548.17 - lr: 0.000066 - momentum: 0.000000 |
|
2023-10-11 13:38:37,129 epoch 7 - iter 108/272 - loss 0.05357931 - time (sec): 38.64 - samples/sec: 542.35 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-11 13:38:46,616 epoch 7 - iter 135/272 - loss 0.05501228 - time (sec): 48.12 - samples/sec: 549.00 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-11 13:38:56,744 epoch 7 - iter 162/272 - loss 0.05044097 - time (sec): 58.25 - samples/sec: 553.71 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-11 13:39:06,470 epoch 7 - iter 189/272 - loss 0.04992894 - time (sec): 67.98 - samples/sec: 550.74 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-11 13:39:15,738 epoch 7 - iter 216/272 - loss 0.05389474 - time (sec): 77.25 - samples/sec: 548.65 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-11 13:39:24,510 epoch 7 - iter 243/272 - loss 0.05148651 - time (sec): 86.02 - samples/sec: 538.50 - lr: 0.000055 - momentum: 0.000000 |
|
2023-10-11 13:39:34,229 epoch 7 - iter 270/272 - loss 0.04901036 - time (sec): 95.74 - samples/sec: 540.38 - lr: 0.000054 - momentum: 0.000000 |
|
2023-10-11 13:39:34,706 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:39:34,706 EPOCH 7 done: loss 0.0490 - lr: 0.000054 |
|
2023-10-11 13:39:40,220 DEV : loss 0.13726186752319336 - f1-score (micro avg) 0.7912 |
|
2023-10-11 13:39:40,229 saving best model |
|
2023-10-11 13:39:42,826 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:39:52,045 epoch 8 - iter 27/272 - loss 0.02987934 - time (sec): 9.21 - samples/sec: 517.77 - lr: 0.000052 - momentum: 0.000000 |
|
2023-10-11 13:40:00,972 epoch 8 - iter 54/272 - loss 0.03485461 - time (sec): 18.14 - samples/sec: 514.85 - lr: 0.000050 - momentum: 0.000000 |
|
2023-10-11 13:40:10,553 epoch 8 - iter 81/272 - loss 0.03793773 - time (sec): 27.72 - samples/sec: 523.19 - lr: 0.000048 - momentum: 0.000000 |
|
2023-10-11 13:40:21,107 epoch 8 - iter 108/272 - loss 0.03671922 - time (sec): 38.28 - samples/sec: 539.15 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-11 13:40:30,261 epoch 8 - iter 135/272 - loss 0.03973438 - time (sec): 47.43 - samples/sec: 533.66 - lr: 0.000045 - momentum: 0.000000 |
|
2023-10-11 13:40:39,815 epoch 8 - iter 162/272 - loss 0.03971603 - time (sec): 56.98 - samples/sec: 536.81 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-11 13:40:49,558 epoch 8 - iter 189/272 - loss 0.04039939 - time (sec): 66.73 - samples/sec: 538.65 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-11 13:40:58,977 epoch 8 - iter 216/272 - loss 0.03807732 - time (sec): 76.15 - samples/sec: 540.79 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-11 13:41:08,486 epoch 8 - iter 243/272 - loss 0.03765314 - time (sec): 85.66 - samples/sec: 544.18 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-11 13:41:17,813 epoch 8 - iter 270/272 - loss 0.03745509 - time (sec): 94.98 - samples/sec: 546.55 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-11 13:41:18,153 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:41:18,153 EPOCH 8 done: loss 0.0377 - lr: 0.000036 |
|
2023-10-11 13:41:24,087 DEV : loss 0.1399751901626587 - f1-score (micro avg) 0.7963 |
|
2023-10-11 13:41:24,095 saving best model |
|
2023-10-11 13:41:26,673 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:41:35,890 epoch 9 - iter 27/272 - loss 0.03181528 - time (sec): 9.21 - samples/sec: 549.22 - lr: 0.000034 - momentum: 0.000000 |
|
2023-10-11 13:41:45,533 epoch 9 - iter 54/272 - loss 0.03782289 - time (sec): 18.86 - samples/sec: 552.08 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-11 13:41:55,101 epoch 9 - iter 81/272 - loss 0.03522035 - time (sec): 28.42 - samples/sec: 556.79 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-11 13:42:04,329 epoch 9 - iter 108/272 - loss 0.03337439 - time (sec): 37.65 - samples/sec: 554.47 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-11 13:42:13,770 epoch 9 - iter 135/272 - loss 0.03263354 - time (sec): 47.09 - samples/sec: 555.29 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-11 13:42:23,590 epoch 9 - iter 162/272 - loss 0.03181414 - time (sec): 56.91 - samples/sec: 553.60 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-11 13:42:32,803 epoch 9 - iter 189/272 - loss 0.03172942 - time (sec): 66.13 - samples/sec: 551.87 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-11 13:42:41,941 epoch 9 - iter 216/272 - loss 0.03342724 - time (sec): 75.26 - samples/sec: 550.32 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-11 13:42:51,133 epoch 9 - iter 243/272 - loss 0.03123165 - time (sec): 84.46 - samples/sec: 550.62 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-11 13:43:00,428 epoch 9 - iter 270/272 - loss 0.03083400 - time (sec): 93.75 - samples/sec: 549.52 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-11 13:43:01,120 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:43:01,121 EPOCH 9 done: loss 0.0310 - lr: 0.000018 |
|
2023-10-11 13:43:06,714 DEV : loss 0.13962143659591675 - f1-score (micro avg) 0.7919 |
|
2023-10-11 13:43:06,723 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:43:16,268 epoch 10 - iter 27/272 - loss 0.02047195 - time (sec): 9.54 - samples/sec: 540.05 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-11 13:43:25,152 epoch 10 - iter 54/272 - loss 0.01777598 - time (sec): 18.43 - samples/sec: 532.04 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-11 13:43:35,144 epoch 10 - iter 81/272 - loss 0.02262355 - time (sec): 28.42 - samples/sec: 549.70 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-11 13:43:44,586 epoch 10 - iter 108/272 - loss 0.02189460 - time (sec): 37.86 - samples/sec: 553.18 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-11 13:43:54,049 epoch 10 - iter 135/272 - loss 0.02502590 - time (sec): 47.32 - samples/sec: 559.33 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-11 13:44:04,410 epoch 10 - iter 162/272 - loss 0.02888403 - time (sec): 57.69 - samples/sec: 570.21 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-11 13:44:12,664 epoch 10 - iter 189/272 - loss 0.02872227 - time (sec): 65.94 - samples/sec: 558.21 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-11 13:44:22,060 epoch 10 - iter 216/272 - loss 0.02809091 - time (sec): 75.34 - samples/sec: 555.91 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-11 13:44:31,176 epoch 10 - iter 243/272 - loss 0.02744388 - time (sec): 84.45 - samples/sec: 553.40 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-11 13:44:40,474 epoch 10 - iter 270/272 - loss 0.02739646 - time (sec): 93.75 - samples/sec: 550.50 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-11 13:44:41,072 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:44:41,073 EPOCH 10 done: loss 0.0273 - lr: 0.000000 |
|
2023-10-11 13:44:46,865 DEV : loss 0.14164641499519348 - f1-score (micro avg) 0.7839 |
|
2023-10-11 13:44:47,739 ---------------------------------------------------------------------------------------------------- |
|
2023-10-11 13:44:47,741 Loading model from best epoch ... |
|
2023-10-11 13:44:52,960 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG |
|
2023-10-11 13:45:05,711 |
|
Results: |
|
- F-score (micro) 0.7695 |
|
- F-score (macro) 0.7198 |
|
- Accuracy 0.6441 |
|
|
|
By class: |
|
precision recall f1-score support |
|
|
|
LOC 0.7924 0.8686 0.8287 312 |
|
PER 0.6743 0.8462 0.7505 208 |
|
ORG 0.4746 0.5091 0.4912 55 |
|
HumanProd 0.7600 0.8636 0.8085 22 |
|
|
|
micro avg 0.7191 0.8275 0.7695 597 |
|
macro avg 0.6753 0.7719 0.7198 597 |
|
weighted avg 0.7208 0.8275 0.7697 597 |
|
|
|
2023-10-11 13:45:05,712 ---------------------------------------------------------------------------------------------------- |
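For completeness, a minimal sketch of loading the saved best-model.pt and tagging a sentence with it; the model path is the logged base path and the example sentence is invented:

```python
# Sketch only: load the saved tagger and run prediction (example sentence is invented).
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/sv-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-5/best-model.pt"
)

sentence = Sentence("Stockholms Dagblad utkom i Stockholm .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)
```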