2023-10-11 05:01:34,727 ----------------------------------------------------------------------------------------------------
2023-10-11 05:01:34,729 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-11 05:01:34,729 ----------------------------------------------------------------------------------------------------
2023-10-11 05:01:34,729 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
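The corpus is the French NewsEye subset of HIPE-2022. A hedged sketch of loading it with Flair's HIPE-2022 loader and assembling the tagger on top of the embeddings from the previous sketch follows; constructor arguments may differ between Flair versions, and the logged run additionally wraps the corpus in a MultiCorpus.

```python
# Sketch only: load the corpus referenced above and build the tagger on top of the
# ByT5 embeddings from the previous sketch. The cached folder name "with_doc_seperator"
# suggests document separators were kept.
from flair.datasets import NER_HIPE_2022
from flair.models import SequenceTagger

corpus = NER_HIPE_2022(dataset_name="newseye", language="fr")
# 7142 train + 698 dev + 2570 test sentences, as logged above

label_dict = corpus.make_label_dictionary(label_type="ner")  # label dictionary from the corpus annotations

tagger = SequenceTagger(
    hidden_size=256,              # effectively unused: no RNN/CRF is stacked on the embeddings
    embeddings=embeddings,        # ByT5 embeddings from the previous sketch
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,                # "crfFalse" in the base path: plain linear head + CrossEntropyLoss
    use_rnn=False,
    reproject_embeddings=False,   # keep the raw 1472-dim ByT5 output feeding the final linear layer
)
```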
|
2023-10-11 05:01:34,729 ----------------------------------------------------------------------------------------------------
2023-10-11 05:01:34,729 Train: 7142 sentences
2023-10-11 05:01:34,729 (train_with_dev=False, train_with_test=False)
2023-10-11 05:01:34,729 ----------------------------------------------------------------------------------------------------
2023-10-11 05:01:34,730 Training Params:
2023-10-11 05:01:34,730 - learning_rate: "0.00015"
2023-10-11 05:01:34,730 - mini_batch_size: "4"
2023-10-11 05:01:34,730 - max_epochs: "10"
2023-10-11 05:01:34,730 - shuffle: "True"
2023-10-11 05:01:34,730 ----------------------------------------------------------------------------------------------------
2023-10-11 05:01:34,730 Plugins:
2023-10-11 05:01:34,730 - TensorboardLogger
2023-10-11 05:01:34,730 - LinearScheduler | warmup_fraction: '0.1'
2023-10-11 05:01:34,730 ----------------------------------------------------------------------------------------------------
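The lr column in the iteration logs below follows a linear warmup/decay schedule: warmup_fraction 0.1 means the learning rate ramps up over the first 10% of the 10 × 1786 = 17,860 optimizer steps and then decays linearly to zero. A small illustrative re-computation (not Flair's LinearScheduler code) that reproduces the logged values:

```python
# Illustration only: the per-iteration lr values in this log are consistent with
# linear warmup over the first 10% of all optimizer steps, then linear decay to zero.
def linear_schedule_lr(step, peak_lr=0.00015, total_steps=10 * 1786, warmup_fraction=0.1):
    warmup_steps = int(total_steps * warmup_fraction)  # 1786 steps = the first epoch
    if step < warmup_steps:
        return peak_lr * step / warmup_steps                               # ramp up
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay to 0

print(linear_schedule_lr(178))    # ~0.000015  (epoch 1, iter 178)
print(linear_schedule_lr(1780))   # ~0.000149  (epoch 1, iter 1780)
print(linear_schedule_lr(3566))   # ~0.000133  (epoch 2, iter 1780)
```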
|
2023-10-11 05:01:34,730 Final evaluation on model from best epoch (best-model.pt)
2023-10-11 05:01:34,730 - metric: "('micro avg', 'f1-score')"
2023-10-11 05:01:34,730 ----------------------------------------------------------------------------------------------------
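Putting the pieces together, a fine-tuning call consistent with the logged hyperparameters would look roughly like this; `tagger` and `corpus` come from the sketches above, the base path is the one logged below, and keyword names follow recent Flair releases so they may differ slightly.

```python
# Sketch only: fine-tuning call matching the logged training params
# (learning_rate 0.00015, mini_batch_size 4, max_epochs 10, shuffle True).
from flair.trainers import ModelTrainer

base_path = (
    "hmbench-newseye/fr-hmbyt5-preliminary/"
    "byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2"
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    base_path,
    learning_rate=0.00015,   # peak lr reached after warmup, as logged
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    # The run also logs a TensorboardLogger plugin and a LinearScheduler with
    # warmup_fraction 0.1; the linear warmup/decay schedule is what fine_tune()
    # sets up by default in recent Flair versions.
)
```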
|
2023-10-11 05:01:34,730 Computation:
2023-10-11 05:01:34,730 - compute on device: cuda:0
2023-10-11 05:01:34,730 - embedding storage: none
2023-10-11 05:01:34,731 ----------------------------------------------------------------------------------------------------
2023-10-11 05:01:34,731 Model training base path: "hmbench-newseye/fr-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2"
2023-10-11 05:01:34,731 ----------------------------------------------------------------------------------------------------
2023-10-11 05:01:34,731 ----------------------------------------------------------------------------------------------------
2023-10-11 05:01:34,731 Logging anything other than scalars to TensorBoard is currently not supported.
|
2023-10-11 05:02:27,827 epoch 1 - iter 178/1786 - loss 2.83041151 - time (sec): 53.09 - samples/sec: 500.15 - lr: 0.000015 - momentum: 0.000000
2023-10-11 05:03:20,133 epoch 1 - iter 356/1786 - loss 2.69429508 - time (sec): 105.40 - samples/sec: 491.24 - lr: 0.000030 - momentum: 0.000000
2023-10-11 05:04:14,338 epoch 1 - iter 534/1786 - loss 2.41681822 - time (sec): 159.61 - samples/sec: 486.67 - lr: 0.000045 - momentum: 0.000000
2023-10-11 05:05:10,271 epoch 1 - iter 712/1786 - loss 2.10924834 - time (sec): 215.54 - samples/sec: 477.63 - lr: 0.000060 - momentum: 0.000000
2023-10-11 05:06:05,087 epoch 1 - iter 890/1786 - loss 1.84410867 - time (sec): 270.35 - samples/sec: 470.72 - lr: 0.000075 - momentum: 0.000000
2023-10-11 05:07:05,666 epoch 1 - iter 1068/1786 - loss 1.63794432 - time (sec): 330.93 - samples/sec: 462.45 - lr: 0.000090 - momentum: 0.000000
2023-10-11 05:08:04,516 epoch 1 - iter 1246/1786 - loss 1.47552627 - time (sec): 389.78 - samples/sec: 456.44 - lr: 0.000105 - momentum: 0.000000
2023-10-11 05:09:00,303 epoch 1 - iter 1424/1786 - loss 1.35086011 - time (sec): 445.57 - samples/sec: 451.33 - lr: 0.000120 - momentum: 0.000000
2023-10-11 05:09:52,316 epoch 1 - iter 1602/1786 - loss 1.24502390 - time (sec): 497.58 - samples/sec: 451.20 - lr: 0.000134 - momentum: 0.000000
2023-10-11 05:10:43,774 epoch 1 - iter 1780/1786 - loss 1.15419317 - time (sec): 549.04 - samples/sec: 452.16 - lr: 0.000149 - momentum: 0.000000
2023-10-11 05:10:45,213 ----------------------------------------------------------------------------------------------------
2023-10-11 05:10:45,213 EPOCH 1 done: loss 1.1525 - lr: 0.000149
2023-10-11 05:11:06,239 DEV : loss 0.20120850205421448 - f1-score (micro avg) 0.5061
2023-10-11 05:11:06,271 saving best model
2023-10-11 05:11:07,380 ----------------------------------------------------------------------------------------------------
|
2023-10-11 05:12:00,838 epoch 2 - iter 178/1786 - loss 0.20647638 - time (sec): 53.46 - samples/sec: 470.14 - lr: 0.000148 - momentum: 0.000000
2023-10-11 05:12:55,186 epoch 2 - iter 356/1786 - loss 0.19196816 - time (sec): 107.80 - samples/sec: 477.92 - lr: 0.000147 - momentum: 0.000000
2023-10-11 05:13:49,188 epoch 2 - iter 534/1786 - loss 0.17945236 - time (sec): 161.81 - samples/sec: 471.84 - lr: 0.000145 - momentum: 0.000000
2023-10-11 05:14:41,335 epoch 2 - iter 712/1786 - loss 0.17099654 - time (sec): 213.95 - samples/sec: 469.79 - lr: 0.000143 - momentum: 0.000000
2023-10-11 05:15:33,535 epoch 2 - iter 890/1786 - loss 0.16314737 - time (sec): 266.15 - samples/sec: 468.04 - lr: 0.000142 - momentum: 0.000000
2023-10-11 05:16:27,194 epoch 2 - iter 1068/1786 - loss 0.15692538 - time (sec): 319.81 - samples/sec: 464.19 - lr: 0.000140 - momentum: 0.000000
2023-10-11 05:17:17,836 epoch 2 - iter 1246/1786 - loss 0.15088690 - time (sec): 370.45 - samples/sec: 464.60 - lr: 0.000138 - momentum: 0.000000
2023-10-11 05:18:09,337 epoch 2 - iter 1424/1786 - loss 0.14597758 - time (sec): 421.96 - samples/sec: 467.64 - lr: 0.000137 - momentum: 0.000000
2023-10-11 05:19:01,609 epoch 2 - iter 1602/1786 - loss 0.14238346 - time (sec): 474.23 - samples/sec: 471.09 - lr: 0.000135 - momentum: 0.000000
2023-10-11 05:19:52,592 epoch 2 - iter 1780/1786 - loss 0.13731741 - time (sec): 525.21 - samples/sec: 472.20 - lr: 0.000133 - momentum: 0.000000
2023-10-11 05:19:54,174 ----------------------------------------------------------------------------------------------------
2023-10-11 05:19:54,174 EPOCH 2 done: loss 0.1372 - lr: 0.000133
2023-10-11 05:20:14,463 DEV : loss 0.11661199480295181 - f1-score (micro avg) 0.7552
2023-10-11 05:20:14,492 saving best model
2023-10-11 05:20:17,182 ----------------------------------------------------------------------------------------------------
|
2023-10-11 05:21:15,759 epoch 3 - iter 178/1786 - loss 0.07763298 - time (sec): 58.57 - samples/sec: 441.25 - lr: 0.000132 - momentum: 0.000000
2023-10-11 05:22:11,182 epoch 3 - iter 356/1786 - loss 0.07495553 - time (sec): 113.99 - samples/sec: 426.96 - lr: 0.000130 - momentum: 0.000000
2023-10-11 05:23:09,155 epoch 3 - iter 534/1786 - loss 0.07669306 - time (sec): 171.97 - samples/sec: 430.39 - lr: 0.000128 - momentum: 0.000000
2023-10-11 05:24:05,593 epoch 3 - iter 712/1786 - loss 0.07775822 - time (sec): 228.41 - samples/sec: 430.05 - lr: 0.000127 - momentum: 0.000000
2023-10-11 05:25:03,308 epoch 3 - iter 890/1786 - loss 0.07467629 - time (sec): 286.12 - samples/sec: 429.61 - lr: 0.000125 - momentum: 0.000000
2023-10-11 05:26:00,547 epoch 3 - iter 1068/1786 - loss 0.07681723 - time (sec): 343.36 - samples/sec: 431.47 - lr: 0.000123 - momentum: 0.000000
2023-10-11 05:26:58,181 epoch 3 - iter 1246/1786 - loss 0.07505527 - time (sec): 400.99 - samples/sec: 435.90 - lr: 0.000122 - momentum: 0.000000
2023-10-11 05:27:52,072 epoch 3 - iter 1424/1786 - loss 0.07490696 - time (sec): 454.88 - samples/sec: 438.92 - lr: 0.000120 - momentum: 0.000000
2023-10-11 05:28:46,503 epoch 3 - iter 1602/1786 - loss 0.07479295 - time (sec): 509.32 - samples/sec: 439.46 - lr: 0.000118 - momentum: 0.000000
2023-10-11 05:29:46,396 epoch 3 - iter 1780/1786 - loss 0.07517529 - time (sec): 569.21 - samples/sec: 436.00 - lr: 0.000117 - momentum: 0.000000
2023-10-11 05:29:48,337 ----------------------------------------------------------------------------------------------------
2023-10-11 05:29:48,338 EPOCH 3 done: loss 0.0754 - lr: 0.000117
2023-10-11 05:30:11,692 DEV : loss 0.1217304989695549 - f1-score (micro avg) 0.7823
2023-10-11 05:30:11,727 saving best model
2023-10-11 05:30:14,519 ----------------------------------------------------------------------------------------------------
|
2023-10-11 05:31:12,835 epoch 4 - iter 178/1786 - loss 0.06283829 - time (sec): 58.31 - samples/sec: 424.49 - lr: 0.000115 - momentum: 0.000000
2023-10-11 05:32:06,563 epoch 4 - iter 356/1786 - loss 0.05265726 - time (sec): 112.04 - samples/sec: 426.97 - lr: 0.000113 - momentum: 0.000000
2023-10-11 05:33:01,812 epoch 4 - iter 534/1786 - loss 0.05130324 - time (sec): 167.29 - samples/sec: 441.84 - lr: 0.000112 - momentum: 0.000000
2023-10-11 05:33:57,758 epoch 4 - iter 712/1786 - loss 0.05156142 - time (sec): 223.24 - samples/sec: 450.76 - lr: 0.000110 - momentum: 0.000000
2023-10-11 05:34:50,238 epoch 4 - iter 890/1786 - loss 0.05132046 - time (sec): 275.72 - samples/sec: 448.93 - lr: 0.000108 - momentum: 0.000000
2023-10-11 05:35:46,545 epoch 4 - iter 1068/1786 - loss 0.05071305 - time (sec): 332.02 - samples/sec: 447.62 - lr: 0.000107 - momentum: 0.000000
2023-10-11 05:36:42,170 epoch 4 - iter 1246/1786 - loss 0.05180539 - time (sec): 387.65 - samples/sec: 450.20 - lr: 0.000105 - momentum: 0.000000
2023-10-11 05:37:38,584 epoch 4 - iter 1424/1786 - loss 0.05196249 - time (sec): 444.06 - samples/sec: 447.63 - lr: 0.000103 - momentum: 0.000000
2023-10-11 05:38:33,429 epoch 4 - iter 1602/1786 - loss 0.05160455 - time (sec): 498.91 - samples/sec: 447.19 - lr: 0.000102 - momentum: 0.000000
2023-10-11 05:39:31,504 epoch 4 - iter 1780/1786 - loss 0.05213671 - time (sec): 556.98 - samples/sec: 445.74 - lr: 0.000100 - momentum: 0.000000
2023-10-11 05:39:33,161 ----------------------------------------------------------------------------------------------------
2023-10-11 05:39:33,161 EPOCH 4 done: loss 0.0520 - lr: 0.000100
2023-10-11 05:39:55,740 DEV : loss 0.15838001668453217 - f1-score (micro avg) 0.7935
2023-10-11 05:39:55,771 saving best model
2023-10-11 05:39:58,431 ----------------------------------------------------------------------------------------------------
|
2023-10-11 05:40:58,919 epoch 5 - iter 178/1786 - loss 0.04547964 - time (sec): 60.48 - samples/sec: 411.44 - lr: 0.000098 - momentum: 0.000000
2023-10-11 05:41:53,865 epoch 5 - iter 356/1786 - loss 0.04560730 - time (sec): 115.43 - samples/sec: 411.80 - lr: 0.000097 - momentum: 0.000000
2023-10-11 05:42:52,796 epoch 5 - iter 534/1786 - loss 0.04303715 - time (sec): 174.36 - samples/sec: 415.11 - lr: 0.000095 - momentum: 0.000000
2023-10-11 05:43:49,710 epoch 5 - iter 712/1786 - loss 0.04135883 - time (sec): 231.27 - samples/sec: 420.84 - lr: 0.000093 - momentum: 0.000000
2023-10-11 05:44:48,917 epoch 5 - iter 890/1786 - loss 0.04142354 - time (sec): 290.48 - samples/sec: 417.48 - lr: 0.000092 - momentum: 0.000000
2023-10-11 05:45:52,793 epoch 5 - iter 1068/1786 - loss 0.03982523 - time (sec): 354.36 - samples/sec: 412.77 - lr: 0.000090 - momentum: 0.000000
2023-10-11 05:47:01,510 epoch 5 - iter 1246/1786 - loss 0.04033156 - time (sec): 423.07 - samples/sec: 408.31 - lr: 0.000088 - momentum: 0.000000
2023-10-11 05:48:01,676 epoch 5 - iter 1424/1786 - loss 0.04015505 - time (sec): 483.24 - samples/sec: 408.92 - lr: 0.000087 - momentum: 0.000000
2023-10-11 05:48:58,632 epoch 5 - iter 1602/1786 - loss 0.03983915 - time (sec): 540.20 - samples/sec: 411.43 - lr: 0.000085 - momentum: 0.000000
2023-10-11 05:49:52,916 epoch 5 - iter 1780/1786 - loss 0.03991586 - time (sec): 594.48 - samples/sec: 417.29 - lr: 0.000083 - momentum: 0.000000
2023-10-11 05:49:54,495 ----------------------------------------------------------------------------------------------------
2023-10-11 05:49:54,496 EPOCH 5 done: loss 0.0398 - lr: 0.000083
2023-10-11 05:50:15,738 DEV : loss 0.1628066748380661 - f1-score (micro avg) 0.8089
2023-10-11 05:50:15,773 saving best model
2023-10-11 05:50:18,533 ----------------------------------------------------------------------------------------------------
|
2023-10-11 05:51:12,723 epoch 6 - iter 178/1786 - loss 0.02789412 - time (sec): 54.19 - samples/sec: 456.91 - lr: 0.000082 - momentum: 0.000000
2023-10-11 05:52:07,670 epoch 6 - iter 356/1786 - loss 0.02958477 - time (sec): 109.13 - samples/sec: 456.59 - lr: 0.000080 - momentum: 0.000000
2023-10-11 05:52:59,150 epoch 6 - iter 534/1786 - loss 0.02757124 - time (sec): 160.61 - samples/sec: 463.60 - lr: 0.000078 - momentum: 0.000000
2023-10-11 05:53:50,942 epoch 6 - iter 712/1786 - loss 0.02729736 - time (sec): 212.41 - samples/sec: 467.18 - lr: 0.000077 - momentum: 0.000000
2023-10-11 05:54:43,400 epoch 6 - iter 890/1786 - loss 0.02840525 - time (sec): 264.86 - samples/sec: 468.78 - lr: 0.000075 - momentum: 0.000000
2023-10-11 05:55:37,498 epoch 6 - iter 1068/1786 - loss 0.02774450 - time (sec): 318.96 - samples/sec: 470.91 - lr: 0.000073 - momentum: 0.000000
2023-10-11 05:56:32,273 epoch 6 - iter 1246/1786 - loss 0.02655476 - time (sec): 373.74 - samples/sec: 466.63 - lr: 0.000072 - momentum: 0.000000
2023-10-11 05:57:26,142 epoch 6 - iter 1424/1786 - loss 0.02755150 - time (sec): 427.61 - samples/sec: 466.59 - lr: 0.000070 - momentum: 0.000000
2023-10-11 05:58:19,208 epoch 6 - iter 1602/1786 - loss 0.02776326 - time (sec): 480.67 - samples/sec: 468.70 - lr: 0.000068 - momentum: 0.000000
2023-10-11 05:59:09,786 epoch 6 - iter 1780/1786 - loss 0.02837332 - time (sec): 531.25 - samples/sec: 467.37 - lr: 0.000067 - momentum: 0.000000
2023-10-11 05:59:11,193 ----------------------------------------------------------------------------------------------------
2023-10-11 05:59:11,193 EPOCH 6 done: loss 0.0283 - lr: 0.000067
2023-10-11 05:59:32,445 DEV : loss 0.17363940179347992 - f1-score (micro avg) 0.8075
2023-10-11 05:59:32,474 ----------------------------------------------------------------------------------------------------
|
2023-10-11 06:00:27,457 epoch 7 - iter 178/1786 - loss 0.02062730 - time (sec): 54.98 - samples/sec: 497.62 - lr: 0.000065 - momentum: 0.000000
2023-10-11 06:01:26,025 epoch 7 - iter 356/1786 - loss 0.02080307 - time (sec): 113.55 - samples/sec: 457.95 - lr: 0.000063 - momentum: 0.000000
2023-10-11 06:02:18,555 epoch 7 - iter 534/1786 - loss 0.02116990 - time (sec): 166.08 - samples/sec: 455.85 - lr: 0.000062 - momentum: 0.000000
2023-10-11 06:03:10,462 epoch 7 - iter 712/1786 - loss 0.02046165 - time (sec): 217.99 - samples/sec: 458.64 - lr: 0.000060 - momentum: 0.000000
2023-10-11 06:04:02,581 epoch 7 - iter 890/1786 - loss 0.02124109 - time (sec): 270.10 - samples/sec: 460.37 - lr: 0.000058 - momentum: 0.000000
2023-10-11 06:04:55,342 epoch 7 - iter 1068/1786 - loss 0.01994612 - time (sec): 322.86 - samples/sec: 464.15 - lr: 0.000057 - momentum: 0.000000
2023-10-11 06:05:46,465 epoch 7 - iter 1246/1786 - loss 0.02017532 - time (sec): 373.99 - samples/sec: 464.41 - lr: 0.000055 - momentum: 0.000000
2023-10-11 06:06:39,403 epoch 7 - iter 1424/1786 - loss 0.01973361 - time (sec): 426.93 - samples/sec: 468.63 - lr: 0.000053 - momentum: 0.000000
2023-10-11 06:07:31,780 epoch 7 - iter 1602/1786 - loss 0.01962053 - time (sec): 479.30 - samples/sec: 468.69 - lr: 0.000052 - momentum: 0.000000
2023-10-11 06:08:23,880 epoch 7 - iter 1780/1786 - loss 0.02041587 - time (sec): 531.40 - samples/sec: 466.78 - lr: 0.000050 - momentum: 0.000000
2023-10-11 06:08:25,507 ----------------------------------------------------------------------------------------------------
2023-10-11 06:08:25,507 EPOCH 7 done: loss 0.0204 - lr: 0.000050
2023-10-11 06:08:46,785 DEV : loss 0.1936260312795639 - f1-score (micro avg) 0.8109
2023-10-11 06:08:46,815 saving best model
2023-10-11 06:08:49,446 ----------------------------------------------------------------------------------------------------
|
2023-10-11 06:09:41,617 epoch 8 - iter 178/1786 - loss 0.01944064 - time (sec): 52.17 - samples/sec: 461.54 - lr: 0.000048 - momentum: 0.000000
2023-10-11 06:10:33,775 epoch 8 - iter 356/1786 - loss 0.01633750 - time (sec): 104.32 - samples/sec: 465.74 - lr: 0.000047 - momentum: 0.000000
2023-10-11 06:11:26,536 epoch 8 - iter 534/1786 - loss 0.01426110 - time (sec): 157.09 - samples/sec: 468.29 - lr: 0.000045 - momentum: 0.000000
2023-10-11 06:12:20,764 epoch 8 - iter 712/1786 - loss 0.01661153 - time (sec): 211.31 - samples/sec: 470.04 - lr: 0.000043 - momentum: 0.000000
2023-10-11 06:13:16,553 epoch 8 - iter 890/1786 - loss 0.01686518 - time (sec): 267.10 - samples/sec: 462.43 - lr: 0.000042 - momentum: 0.000000
2023-10-11 06:14:17,499 epoch 8 - iter 1068/1786 - loss 0.01792696 - time (sec): 328.05 - samples/sec: 452.14 - lr: 0.000040 - momentum: 0.000000
2023-10-11 06:15:22,194 epoch 8 - iter 1246/1786 - loss 0.01811955 - time (sec): 392.74 - samples/sec: 439.83 - lr: 0.000038 - momentum: 0.000000
2023-10-11 06:16:23,362 epoch 8 - iter 1424/1786 - loss 0.01715705 - time (sec): 453.91 - samples/sec: 437.14 - lr: 0.000037 - momentum: 0.000000
2023-10-11 06:17:23,666 epoch 8 - iter 1602/1786 - loss 0.01642572 - time (sec): 514.22 - samples/sec: 433.14 - lr: 0.000035 - momentum: 0.000000
2023-10-11 06:18:18,761 epoch 8 - iter 1780/1786 - loss 0.01609207 - time (sec): 569.31 - samples/sec: 435.25 - lr: 0.000033 - momentum: 0.000000
2023-10-11 06:18:20,641 ----------------------------------------------------------------------------------------------------
2023-10-11 06:18:20,641 EPOCH 8 done: loss 0.0161 - lr: 0.000033
2023-10-11 06:18:42,738 DEV : loss 0.21119572222232819 - f1-score (micro avg) 0.8059
2023-10-11 06:18:42,770 ----------------------------------------------------------------------------------------------------
|
2023-10-11 06:19:41,501 epoch 9 - iter 178/1786 - loss 0.01445335 - time (sec): 58.73 - samples/sec: 439.60 - lr: 0.000032 - momentum: 0.000000
2023-10-11 06:20:35,546 epoch 9 - iter 356/1786 - loss 0.01611300 - time (sec): 112.77 - samples/sec: 445.46 - lr: 0.000030 - momentum: 0.000000
2023-10-11 06:21:27,480 epoch 9 - iter 534/1786 - loss 0.01242337 - time (sec): 164.71 - samples/sec: 454.46 - lr: 0.000028 - momentum: 0.000000
2023-10-11 06:22:19,171 epoch 9 - iter 712/1786 - loss 0.01188148 - time (sec): 216.40 - samples/sec: 458.47 - lr: 0.000027 - momentum: 0.000000
2023-10-11 06:23:11,965 epoch 9 - iter 890/1786 - loss 0.01133948 - time (sec): 269.19 - samples/sec: 460.33 - lr: 0.000025 - momentum: 0.000000
2023-10-11 06:24:06,959 epoch 9 - iter 1068/1786 - loss 0.01089479 - time (sec): 324.19 - samples/sec: 458.23 - lr: 0.000023 - momentum: 0.000000
2023-10-11 06:24:58,784 epoch 9 - iter 1246/1786 - loss 0.01082033 - time (sec): 376.01 - samples/sec: 456.11 - lr: 0.000022 - momentum: 0.000000
2023-10-11 06:25:52,869 epoch 9 - iter 1424/1786 - loss 0.01140983 - time (sec): 430.10 - samples/sec: 458.01 - lr: 0.000020 - momentum: 0.000000
2023-10-11 06:26:48,015 epoch 9 - iter 1602/1786 - loss 0.01172453 - time (sec): 485.24 - samples/sec: 460.19 - lr: 0.000018 - momentum: 0.000000
2023-10-11 06:27:40,966 epoch 9 - iter 1780/1786 - loss 0.01123782 - time (sec): 538.19 - samples/sec: 460.59 - lr: 0.000017 - momentum: 0.000000
2023-10-11 06:27:42,742 ----------------------------------------------------------------------------------------------------
2023-10-11 06:27:42,743 EPOCH 9 done: loss 0.0112 - lr: 0.000017
2023-10-11 06:28:04,945 DEV : loss 0.222304567694664 - f1-score (micro avg) 0.8
2023-10-11 06:28:04,975 ----------------------------------------------------------------------------------------------------
|
2023-10-11 06:28:57,976 epoch 10 - iter 178/1786 - loss 0.00772526 - time (sec): 53.00 - samples/sec: 443.35 - lr: 0.000015 - momentum: 0.000000
2023-10-11 06:29:51,408 epoch 10 - iter 356/1786 - loss 0.00840680 - time (sec): 106.43 - samples/sec: 454.34 - lr: 0.000013 - momentum: 0.000000
2023-10-11 06:30:46,308 epoch 10 - iter 534/1786 - loss 0.00752016 - time (sec): 161.33 - samples/sec: 458.27 - lr: 0.000012 - momentum: 0.000000
2023-10-11 06:31:38,377 epoch 10 - iter 712/1786 - loss 0.00697489 - time (sec): 213.40 - samples/sec: 458.38 - lr: 0.000010 - momentum: 0.000000
2023-10-11 06:32:31,964 epoch 10 - iter 890/1786 - loss 0.00802755 - time (sec): 266.99 - samples/sec: 464.30 - lr: 0.000008 - momentum: 0.000000
2023-10-11 06:33:27,144 epoch 10 - iter 1068/1786 - loss 0.00896322 - time (sec): 322.17 - samples/sec: 467.72 - lr: 0.000007 - momentum: 0.000000
2023-10-11 06:34:18,958 epoch 10 - iter 1246/1786 - loss 0.00910142 - time (sec): 373.98 - samples/sec: 464.87 - lr: 0.000005 - momentum: 0.000000
2023-10-11 06:35:13,297 epoch 10 - iter 1424/1786 - loss 0.00868295 - time (sec): 428.32 - samples/sec: 465.41 - lr: 0.000003 - momentum: 0.000000
2023-10-11 06:36:07,043 epoch 10 - iter 1602/1786 - loss 0.00832301 - time (sec): 482.07 - samples/sec: 464.24 - lr: 0.000002 - momentum: 0.000000
2023-10-11 06:37:00,746 epoch 10 - iter 1780/1786 - loss 0.00823723 - time (sec): 535.77 - samples/sec: 463.07 - lr: 0.000000 - momentum: 0.000000
2023-10-11 06:37:02,365 ----------------------------------------------------------------------------------------------------
2023-10-11 06:37:02,365 EPOCH 10 done: loss 0.0082 - lr: 0.000000
2023-10-11 06:37:23,464 DEV : loss 0.22344990074634552 - f1-score (micro avg) 0.7963
2023-10-11 06:37:24,411 ----------------------------------------------------------------------------------------------------
|
2023-10-11 06:37:24,413 Loading model from best epoch ...
2023-10-11 06:37:28,377 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
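For reference, the saved checkpoint can be reloaded for inference in the same way the final evaluation uses it; the example sentence below is made up, and the label type is assumed to be "ner".

```python
# Sketch: reload the checkpoint selected above ("best-model.pt" from epoch 7,
# dev micro-F1 0.8109) and tag an example sentence.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fr-hmbyt5-preliminary/"
    "byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-2/"
    "best-model.pt"
)

sentence = Sentence("Le Journal de Genève rapporte le discours de Jules Ferry à Paris .")
tagger.predict(sentence)

for span in sentence.get_spans("ner"):
    print(span)  # entity spans tagged as PER / LOC / ORG / HumanProd
```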
|
2023-10-11 06:38:40,166
Results:
- F-score (micro) 0.6861
- F-score (macro) 0.5975
- Accuracy 0.5426

By class:
              precision    recall  f1-score   support

         LOC     0.7197    0.6986    0.7090      1095
         PER     0.7824    0.7638    0.7730      1012
         ORG     0.3843    0.5770    0.4614       357
   HumanProd     0.3443    0.6364    0.4468        33

   micro avg     0.6665    0.7068    0.6861      2497
   macro avg     0.5577    0.6690    0.5975      2497
weighted avg     0.6922    0.7068    0.6961      2497
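As a quick sanity check on the table above: the macro average is the unweighted mean of the per-class scores, while the weighted average weights each class by its support (the micro average cannot be recomputed from this table alone, since it needs the raw true/false positive counts).

```python
# Recomputing the averages from the per-class rows above (illustration only).
f1      = {"LOC": 0.7090, "PER": 0.7730, "ORG": 0.4614, "HumanProd": 0.4468}
support = {"LOC": 1095,   "PER": 1012,   "ORG": 357,    "HumanProd": 33}

macro_f1 = sum(f1.values()) / len(f1)
# ≈ 0.5976 here vs. 0.5975 in the table; the tiny gap comes from rounding of the per-class scores.

weighted_f1 = sum(f1[c] * support[c] for c in f1) / sum(support.values())
# ≈ 0.6961, matching the "weighted avg" row.
```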
|
2023-10-11 06:38:40,166 ----------------------------------------------------------------------------------------------------