|
2023-10-10 21:41:37,288 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:41:37,290 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
|
2023-10-10 21:41:37,290 ---------------------------------------------------------------------------------------------------- |
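The module printout above fully determines the size of the fine-tuned model: a 12-block ByT5-small encoder (d_model 1472, gated-GELU feed-forward width 3584, 384-byte vocabulary, 6 attention heads) plus a linear head over 17 BIOES tags. The short Python sketch below is a back-of-the-envelope count read off those shapes (an estimate, not a figure reported anywhere in this log); it lands at roughly 218M trainable parameters.

    # Rough parameter count derived from the module shapes printed above;
    # an approximation from the repr only, not a number reported by Flair.
    d_model, d_ff, inner_dim, vocab, n_blocks, n_tags = 1472, 3584, 384, 384, 12, 17

    attention = 4 * d_model * inner_dim       # q, k, v, o projections (bias=False)
    ffn = 3 * d_model * d_ff                  # wi_0, wi_1, wo of the gated-GELU FF
    norms = 2 * d_model                       # two RMSNorm weights per block
    per_block = attention + ffn + norms

    total = (n_blocks * per_block             # 12 T5Blocks
             + vocab * d_model                # shared byte embedding (tied with embed_tokens)
             + 32 * 6                         # relative_attention_bias in block 0
             + d_model                        # final_layer_norm
             + (d_model + 1) * n_tags)        # linear tag head (bias=True)
    print(f"~{total / 1e6:.1f}M parameters")  # ~217.7M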
|
2023-10-10 21:41:37,291 MultiCorpus: 1166 train + 165 dev + 415 test sentences |
|
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator |
|
2023-10-10 21:41:37,291 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:41:37,291 Train: 1166 sentences |
|
2023-10-10 21:41:37,291 (train_with_dev=False, train_with_test=False) |
|
2023-10-10 21:41:37,291 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:41:37,291 Training Params: |
|
2023-10-10 21:41:37,291 - learning_rate: "0.00015" |
|
2023-10-10 21:41:37,291 - mini_batch_size: "8" |
|
2023-10-10 21:41:37,291 - max_epochs: "10" |
|
2023-10-10 21:41:37,291 - shuffle: "True" |
|
2023-10-10 21:41:37,291 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:41:37,291 Plugins: |
|
2023-10-10 21:41:37,291 - TensorboardLogger |
|
2023-10-10 21:41:37,292 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-10 21:41:37,292 ---------------------------------------------------------------------------------------------------- |
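The LinearScheduler plugin with warmup_fraction 0.1 ramps the learning rate linearly from 0 up to the configured peak of 0.00015 over the first 10% of all optimizer steps (10 epochs x 146 iterations = 1460 steps, i.e. roughly the first epoch) and then decays it linearly back towards 0. The sketch below is illustrative (it is not Flair's internal scheduler code), but it approximately reproduces the per-iteration lr values logged during training.

    # Illustrative linear warmup/decay schedule implied by
    # "LinearScheduler | warmup_fraction: 0.1"; not Flair's actual implementation.
    def linear_lr(step, total_steps=1460, peak_lr=0.00015, warmup_fraction=0.1):
        warmup_steps = int(total_steps * warmup_fraction)    # 146 steps here
        if step < warmup_steps:
            return peak_lr * step / warmup_steps             # linear ramp-up
        # linear decay from the peak back towards 0 over the remaining steps
        return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

    print(linear_lr(140))    # ~0.000144, close to "lr: 0.000143" near the end of epoch 1
    print(linear_lr(1454))   # ~0.0000007, rounding to the final "lr: 0.000001" in epoch 10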
|
2023-10-10 21:41:37,292 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-10 21:41:37,292 - metric: "('micro avg', 'f1-score')" |
|
2023-10-10 21:41:37,292 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:41:37,292 Computation: |
|
2023-10-10 21:41:37,292 - compute on device: cuda:0 |
|
2023-10-10 21:41:37,292 - embedding storage: none |
|
2023-10-10 21:41:37,292 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:41:37,292 Model training base path: "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1" |
|
2023-10-10 21:41:37,292 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:41:37,292 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:41:37,292 Logging anything other than scalars to TensorBoard is currently not supported. |
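For reference, the header above corresponds roughly to the following Flair fine-tuning setup. This is a reconstruction from the logged hyperparameters, not the original hmbench training script: the ByT5Embeddings class in the printout is a project-specific wrapper (TransformerWordEmbeddings is the closest standard Flair equivalent), and the model id and keyword-argument names are inferred, so treat it as a sketch.

    # Approximate reconstruction of the training setup described above.
    # Hyperparameters come from the "Training Params" section; the model id and
    # some keyword names are inferred and may differ from the original script.
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")   # HIPE-2022 v2.1 newseye/fi
    label_dict = corpus.make_label_dictionary(label_type="ner")

    # Character-level ByT5 encoder, last layer only, first-subtoken pooling
    embeddings = TransformerWordEmbeddings(
        model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    # Linear tag head without CRF or RNN, matching the SequenceTagger printout
    tagger = SequenceTagger(
        hidden_size=256,                 # unused when use_rnn=False
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-newseye/fi-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1",
        learning_rate=0.00015,
        mini_batch_size=8,
        max_epochs=10,
    )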
|
2023-10-10 21:41:47,892 epoch 1 - iter 14/146 - loss 2.83099006 - time (sec): 10.60 - samples/sec: 454.83 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-10 21:41:56,307 epoch 1 - iter 28/146 - loss 2.82672987 - time (sec): 19.01 - samples/sec: 453.39 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-10 21:42:05,412 epoch 1 - iter 42/146 - loss 2.81735922 - time (sec): 28.12 - samples/sec: 487.02 - lr: 0.000042 - momentum: 0.000000 |
|
2023-10-10 21:42:13,913 epoch 1 - iter 56/146 - loss 2.80389264 - time (sec): 36.62 - samples/sec: 492.89 - lr: 0.000057 - momentum: 0.000000 |
|
2023-10-10 21:42:21,745 epoch 1 - iter 70/146 - loss 2.78350842 - time (sec): 44.45 - samples/sec: 488.75 - lr: 0.000071 - momentum: 0.000000 |
|
2023-10-10 21:42:30,315 epoch 1 - iter 84/146 - loss 2.74176859 - time (sec): 53.02 - samples/sec: 484.59 - lr: 0.000085 - momentum: 0.000000 |
|
2023-10-10 21:42:38,433 epoch 1 - iter 98/146 - loss 2.68567253 - time (sec): 61.14 - samples/sec: 478.65 - lr: 0.000100 - momentum: 0.000000 |
|
2023-10-10 21:42:48,723 epoch 1 - iter 112/146 - loss 2.58862607 - time (sec): 71.43 - samples/sec: 486.62 - lr: 0.000114 - momentum: 0.000000 |
|
2023-10-10 21:42:57,392 epoch 1 - iter 126/146 - loss 2.51465264 - time (sec): 80.10 - samples/sec: 484.34 - lr: 0.000128 - momentum: 0.000000 |
|
2023-10-10 21:43:07,025 epoch 1 - iter 140/146 - loss 2.43562024 - time (sec): 89.73 - samples/sec: 478.39 - lr: 0.000143 - momentum: 0.000000 |
|
2023-10-10 21:43:10,647 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:43:10,648 EPOCH 1 done: loss 2.4069 - lr: 0.000143 |
|
2023-10-10 21:43:16,880 DEV : loss 1.356826901435852 - f1-score (micro avg) 0.0 |
|
2023-10-10 21:43:16,889 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:43:26,905 epoch 2 - iter 14/146 - loss 1.39370672 - time (sec): 10.01 - samples/sec: 457.36 - lr: 0.000149 - momentum: 0.000000 |
|
2023-10-10 21:43:36,699 epoch 2 - iter 28/146 - loss 1.35742229 - time (sec): 19.81 - samples/sec: 449.72 - lr: 0.000147 - momentum: 0.000000 |
|
2023-10-10 21:43:46,625 epoch 2 - iter 42/146 - loss 1.22339354 - time (sec): 29.73 - samples/sec: 441.39 - lr: 0.000145 - momentum: 0.000000 |
|
2023-10-10 21:43:56,477 epoch 2 - iter 56/146 - loss 1.12844539 - time (sec): 39.59 - samples/sec: 446.10 - lr: 0.000144 - momentum: 0.000000 |
|
2023-10-10 21:44:07,430 epoch 2 - iter 70/146 - loss 1.05603913 - time (sec): 50.54 - samples/sec: 453.57 - lr: 0.000142 - momentum: 0.000000 |
|
2023-10-10 21:44:16,779 epoch 2 - iter 84/146 - loss 0.99801684 - time (sec): 59.89 - samples/sec: 448.64 - lr: 0.000141 - momentum: 0.000000 |
|
2023-10-10 21:44:26,181 epoch 2 - iter 98/146 - loss 0.95042167 - time (sec): 69.29 - samples/sec: 443.73 - lr: 0.000139 - momentum: 0.000000 |
|
2023-10-10 21:44:35,483 epoch 2 - iter 112/146 - loss 0.91179574 - time (sec): 78.59 - samples/sec: 439.91 - lr: 0.000137 - momentum: 0.000000 |
|
2023-10-10 21:44:44,970 epoch 2 - iter 126/146 - loss 0.87949814 - time (sec): 88.08 - samples/sec: 440.06 - lr: 0.000136 - momentum: 0.000000 |
|
2023-10-10 21:44:54,351 epoch 2 - iter 140/146 - loss 0.84964841 - time (sec): 97.46 - samples/sec: 440.66 - lr: 0.000134 - momentum: 0.000000 |
|
2023-10-10 21:44:58,079 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:44:58,079 EPOCH 2 done: loss 0.8457 - lr: 0.000134 |
|
2023-10-10 21:45:04,659 DEV : loss 0.4102368950843811 - f1-score (micro avg) 0.0 |
|
2023-10-10 21:45:04,669 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:45:13,948 epoch 3 - iter 14/146 - loss 0.53677705 - time (sec): 9.28 - samples/sec: 380.64 - lr: 0.000132 - momentum: 0.000000 |
|
2023-10-10 21:45:23,837 epoch 3 - iter 28/146 - loss 0.43791549 - time (sec): 19.17 - samples/sec: 446.68 - lr: 0.000130 - momentum: 0.000000 |
|
2023-10-10 21:45:32,873 epoch 3 - iter 42/146 - loss 0.45683813 - time (sec): 28.20 - samples/sec: 453.06 - lr: 0.000129 - momentum: 0.000000 |
|
2023-10-10 21:45:41,710 epoch 3 - iter 56/146 - loss 0.44634295 - time (sec): 37.04 - samples/sec: 451.44 - lr: 0.000127 - momentum: 0.000000 |
|
2023-10-10 21:45:50,849 epoch 3 - iter 70/146 - loss 0.43680565 - time (sec): 46.18 - samples/sec: 454.80 - lr: 0.000126 - momentum: 0.000000 |
|
2023-10-10 21:45:59,653 epoch 3 - iter 84/146 - loss 0.43710977 - time (sec): 54.98 - samples/sec: 447.84 - lr: 0.000124 - momentum: 0.000000 |
|
2023-10-10 21:46:09,907 epoch 3 - iter 98/146 - loss 0.46074815 - time (sec): 65.24 - samples/sec: 457.86 - lr: 0.000122 - momentum: 0.000000 |
|
2023-10-10 21:46:19,746 epoch 3 - iter 112/146 - loss 0.44530575 - time (sec): 75.08 - samples/sec: 462.35 - lr: 0.000121 - momentum: 0.000000 |
|
2023-10-10 21:46:29,509 epoch 3 - iter 126/146 - loss 0.43292114 - time (sec): 84.84 - samples/sec: 461.28 - lr: 0.000119 - momentum: 0.000000 |
|
2023-10-10 21:46:37,992 epoch 3 - iter 140/146 - loss 0.43074177 - time (sec): 93.32 - samples/sec: 456.80 - lr: 0.000118 - momentum: 0.000000 |
|
2023-10-10 21:46:41,672 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:46:41,672 EPOCH 3 done: loss 0.4255 - lr: 0.000118 |
|
2023-10-10 21:46:47,458 DEV : loss 0.29022911190986633 - f1-score (micro avg) 0.0078 |
|
2023-10-10 21:46:47,467 saving best model |
|
2023-10-10 21:46:48,888 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:46:59,134 epoch 4 - iter 14/146 - loss 0.30368856 - time (sec): 10.24 - samples/sec: 462.62 - lr: 0.000115 - momentum: 0.000000 |
|
2023-10-10 21:47:08,517 epoch 4 - iter 28/146 - loss 0.28383264 - time (sec): 19.63 - samples/sec: 438.24 - lr: 0.000114 - momentum: 0.000000 |
|
2023-10-10 21:47:18,673 epoch 4 - iter 42/146 - loss 0.34186303 - time (sec): 29.78 - samples/sec: 452.85 - lr: 0.000112 - momentum: 0.000000 |
|
2023-10-10 21:47:28,077 epoch 4 - iter 56/146 - loss 0.34206921 - time (sec): 39.19 - samples/sec: 446.05 - lr: 0.000111 - momentum: 0.000000 |
|
2023-10-10 21:47:38,755 epoch 4 - iter 70/146 - loss 0.33835857 - time (sec): 49.87 - samples/sec: 443.34 - lr: 0.000109 - momentum: 0.000000 |
|
2023-10-10 21:47:48,562 epoch 4 - iter 84/146 - loss 0.33883733 - time (sec): 59.67 - samples/sec: 441.55 - lr: 0.000107 - momentum: 0.000000 |
|
2023-10-10 21:47:59,381 epoch 4 - iter 98/146 - loss 0.33112961 - time (sec): 70.49 - samples/sec: 439.76 - lr: 0.000106 - momentum: 0.000000 |
|
2023-10-10 21:48:08,664 epoch 4 - iter 112/146 - loss 0.32683644 - time (sec): 79.77 - samples/sec: 436.20 - lr: 0.000104 - momentum: 0.000000 |
|
2023-10-10 21:48:18,314 epoch 4 - iter 126/146 - loss 0.32139242 - time (sec): 89.42 - samples/sec: 434.36 - lr: 0.000103 - momentum: 0.000000 |
|
2023-10-10 21:48:27,775 epoch 4 - iter 140/146 - loss 0.32278398 - time (sec): 98.88 - samples/sec: 429.61 - lr: 0.000101 - momentum: 0.000000 |
|
2023-10-10 21:48:32,032 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:48:32,033 EPOCH 4 done: loss 0.3173 - lr: 0.000101 |
|
2023-10-10 21:48:38,292 DEV : loss 0.23751258850097656 - f1-score (micro avg) 0.3686 |
|
2023-10-10 21:48:38,302 saving best model |
|
2023-10-10 21:48:47,130 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:48:56,208 epoch 5 - iter 14/146 - loss 0.26938515 - time (sec): 9.07 - samples/sec: 453.04 - lr: 0.000099 - momentum: 0.000000 |
|
2023-10-10 21:49:06,203 epoch 5 - iter 28/146 - loss 0.33101405 - time (sec): 19.07 - samples/sec: 475.22 - lr: 0.000097 - momentum: 0.000000 |
|
2023-10-10 21:49:15,596 epoch 5 - iter 42/146 - loss 0.31691507 - time (sec): 28.46 - samples/sec: 466.90 - lr: 0.000096 - momentum: 0.000000 |
|
2023-10-10 21:49:24,647 epoch 5 - iter 56/146 - loss 0.28600023 - time (sec): 37.51 - samples/sec: 461.87 - lr: 0.000094 - momentum: 0.000000 |
|
2023-10-10 21:49:34,897 epoch 5 - iter 70/146 - loss 0.27334427 - time (sec): 47.76 - samples/sec: 459.23 - lr: 0.000092 - momentum: 0.000000 |
|
2023-10-10 21:49:44,423 epoch 5 - iter 84/146 - loss 0.26564120 - time (sec): 57.29 - samples/sec: 452.58 - lr: 0.000091 - momentum: 0.000000 |
|
2023-10-10 21:49:53,632 epoch 5 - iter 98/146 - loss 0.26142888 - time (sec): 66.50 - samples/sec: 454.26 - lr: 0.000089 - momentum: 0.000000 |
|
2023-10-10 21:50:02,850 epoch 5 - iter 112/146 - loss 0.25808288 - time (sec): 75.72 - samples/sec: 458.61 - lr: 0.000088 - momentum: 0.000000 |
|
2023-10-10 21:50:11,451 epoch 5 - iter 126/146 - loss 0.25610218 - time (sec): 84.32 - samples/sec: 458.55 - lr: 0.000086 - momentum: 0.000000 |
|
2023-10-10 21:50:20,211 epoch 5 - iter 140/146 - loss 0.25265262 - time (sec): 93.08 - samples/sec: 458.37 - lr: 0.000084 - momentum: 0.000000 |
|
2023-10-10 21:50:23,964 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:50:23,964 EPOCH 5 done: loss 0.2526 - lr: 0.000084 |
|
2023-10-10 21:50:29,941 DEV : loss 0.20196418464183807 - f1-score (micro avg) 0.4746 |
|
2023-10-10 21:50:29,950 saving best model |
|
2023-10-10 21:50:37,460 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:50:47,309 epoch 6 - iter 14/146 - loss 0.21125154 - time (sec): 9.84 - samples/sec: 508.80 - lr: 0.000082 - momentum: 0.000000 |
|
2023-10-10 21:50:56,084 epoch 6 - iter 28/146 - loss 0.22445195 - time (sec): 18.62 - samples/sec: 481.00 - lr: 0.000081 - momentum: 0.000000 |
|
2023-10-10 21:51:05,040 epoch 6 - iter 42/146 - loss 0.21454292 - time (sec): 27.58 - samples/sec: 476.11 - lr: 0.000079 - momentum: 0.000000 |
|
2023-10-10 21:51:13,750 epoch 6 - iter 56/146 - loss 0.20511075 - time (sec): 36.29 - samples/sec: 485.70 - lr: 0.000077 - momentum: 0.000000 |
|
2023-10-10 21:51:23,059 epoch 6 - iter 70/146 - loss 0.20516919 - time (sec): 45.59 - samples/sec: 488.85 - lr: 0.000076 - momentum: 0.000000 |
|
2023-10-10 21:51:31,377 epoch 6 - iter 84/146 - loss 0.20221448 - time (sec): 53.91 - samples/sec: 480.18 - lr: 0.000074 - momentum: 0.000000 |
|
2023-10-10 21:51:40,433 epoch 6 - iter 98/146 - loss 0.20584764 - time (sec): 62.97 - samples/sec: 474.79 - lr: 0.000073 - momentum: 0.000000 |
|
2023-10-10 21:51:49,118 epoch 6 - iter 112/146 - loss 0.19933841 - time (sec): 71.65 - samples/sec: 474.95 - lr: 0.000071 - momentum: 0.000000 |
|
2023-10-10 21:51:58,017 epoch 6 - iter 126/146 - loss 0.20436836 - time (sec): 80.55 - samples/sec: 480.15 - lr: 0.000069 - momentum: 0.000000 |
|
2023-10-10 21:52:05,954 epoch 6 - iter 140/146 - loss 0.20428475 - time (sec): 88.49 - samples/sec: 475.67 - lr: 0.000068 - momentum: 0.000000 |
|
2023-10-10 21:52:10,321 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:52:10,322 EPOCH 6 done: loss 0.2022 - lr: 0.000068 |
|
2023-10-10 21:52:16,082 DEV : loss 0.17842039465904236 - f1-score (micro avg) 0.5636 |
|
2023-10-10 21:52:16,091 saving best model |
|
2023-10-10 21:52:24,741 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:52:34,632 epoch 7 - iter 14/146 - loss 0.16519836 - time (sec): 9.89 - samples/sec: 484.44 - lr: 0.000066 - momentum: 0.000000 |
|
2023-10-10 21:52:43,455 epoch 7 - iter 28/146 - loss 0.15491464 - time (sec): 18.71 - samples/sec: 459.41 - lr: 0.000064 - momentum: 0.000000 |
|
2023-10-10 21:52:52,480 epoch 7 - iter 42/146 - loss 0.17736530 - time (sec): 27.74 - samples/sec: 460.60 - lr: 0.000062 - momentum: 0.000000 |
|
2023-10-10 21:53:01,686 epoch 7 - iter 56/146 - loss 0.16335559 - time (sec): 36.94 - samples/sec: 477.54 - lr: 0.000061 - momentum: 0.000000 |
|
2023-10-10 21:53:10,637 epoch 7 - iter 70/146 - loss 0.15852353 - time (sec): 45.89 - samples/sec: 477.37 - lr: 0.000059 - momentum: 0.000000 |
|
2023-10-10 21:53:18,935 epoch 7 - iter 84/146 - loss 0.16471171 - time (sec): 54.19 - samples/sec: 468.14 - lr: 0.000058 - momentum: 0.000000 |
|
2023-10-10 21:53:28,006 epoch 7 - iter 98/146 - loss 0.16359888 - time (sec): 63.26 - samples/sec: 467.46 - lr: 0.000056 - momentum: 0.000000 |
|
2023-10-10 21:53:37,417 epoch 7 - iter 112/146 - loss 0.15896974 - time (sec): 72.67 - samples/sec: 473.77 - lr: 0.000054 - momentum: 0.000000 |
|
2023-10-10 21:53:46,512 epoch 7 - iter 126/146 - loss 0.15746455 - time (sec): 81.77 - samples/sec: 474.51 - lr: 0.000053 - momentum: 0.000000 |
|
2023-10-10 21:53:56,111 epoch 7 - iter 140/146 - loss 0.16140715 - time (sec): 91.37 - samples/sec: 463.92 - lr: 0.000051 - momentum: 0.000000 |
|
2023-10-10 21:54:00,586 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:54:00,587 EPOCH 7 done: loss 0.1639 - lr: 0.000051 |
|
2023-10-10 21:54:07,600 DEV : loss 0.16299203038215637 - f1-score (micro avg) 0.6061 |
|
2023-10-10 21:54:07,610 saving best model |
|
2023-10-10 21:54:15,265 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:54:24,807 epoch 8 - iter 14/146 - loss 0.14054633 - time (sec): 9.54 - samples/sec: 515.83 - lr: 0.000049 - momentum: 0.000000 |
|
2023-10-10 21:54:33,707 epoch 8 - iter 28/146 - loss 0.15463910 - time (sec): 18.44 - samples/sec: 526.26 - lr: 0.000047 - momentum: 0.000000 |
|
2023-10-10 21:54:42,354 epoch 8 - iter 42/146 - loss 0.15471171 - time (sec): 27.08 - samples/sec: 512.50 - lr: 0.000046 - momentum: 0.000000 |
|
2023-10-10 21:54:50,606 epoch 8 - iter 56/146 - loss 0.14788860 - time (sec): 35.34 - samples/sec: 498.83 - lr: 0.000044 - momentum: 0.000000 |
|
2023-10-10 21:54:59,640 epoch 8 - iter 70/146 - loss 0.14632606 - time (sec): 44.37 - samples/sec: 498.54 - lr: 0.000043 - momentum: 0.000000 |
|
2023-10-10 21:55:08,163 epoch 8 - iter 84/146 - loss 0.15098798 - time (sec): 52.89 - samples/sec: 491.80 - lr: 0.000041 - momentum: 0.000000 |
|
2023-10-10 21:55:17,245 epoch 8 - iter 98/146 - loss 0.14595901 - time (sec): 61.98 - samples/sec: 483.64 - lr: 0.000039 - momentum: 0.000000 |
|
2023-10-10 21:55:27,185 epoch 8 - iter 112/146 - loss 0.14346752 - time (sec): 71.92 - samples/sec: 480.22 - lr: 0.000038 - momentum: 0.000000 |
|
2023-10-10 21:55:36,937 epoch 8 - iter 126/146 - loss 0.14157061 - time (sec): 81.67 - samples/sec: 471.85 - lr: 0.000036 - momentum: 0.000000 |
|
2023-10-10 21:55:46,970 epoch 8 - iter 140/146 - loss 0.13629684 - time (sec): 91.70 - samples/sec: 467.06 - lr: 0.000035 - momentum: 0.000000 |
|
2023-10-10 21:55:50,860 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:55:50,860 EPOCH 8 done: loss 0.1398 - lr: 0.000035 |
|
2023-10-10 21:55:56,778 DEV : loss 0.15665240585803986 - f1-score (micro avg) 0.6504 |
|
2023-10-10 21:55:56,788 saving best model |
|
2023-10-10 21:56:06,636 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:56:16,167 epoch 9 - iter 14/146 - loss 0.10274893 - time (sec): 9.53 - samples/sec: 441.91 - lr: 0.000032 - momentum: 0.000000 |
|
2023-10-10 21:56:26,757 epoch 9 - iter 28/146 - loss 0.13023131 - time (sec): 20.12 - samples/sec: 454.61 - lr: 0.000031 - momentum: 0.000000 |
|
2023-10-10 21:56:36,486 epoch 9 - iter 42/146 - loss 0.12947726 - time (sec): 29.85 - samples/sec: 444.52 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-10 21:56:46,259 epoch 9 - iter 56/146 - loss 0.12059411 - time (sec): 39.62 - samples/sec: 442.36 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-10 21:56:55,148 epoch 9 - iter 70/146 - loss 0.12012299 - time (sec): 48.51 - samples/sec: 450.59 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-10 21:57:03,385 epoch 9 - iter 84/146 - loss 0.12128595 - time (sec): 56.74 - samples/sec: 456.05 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-10 21:57:12,121 epoch 9 - iter 98/146 - loss 0.12559222 - time (sec): 65.48 - samples/sec: 461.22 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-10 21:57:20,492 epoch 9 - iter 112/146 - loss 0.12343906 - time (sec): 73.85 - samples/sec: 465.07 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-10 21:57:29,518 epoch 9 - iter 126/146 - loss 0.12193938 - time (sec): 82.88 - samples/sec: 472.03 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-10 21:57:37,883 epoch 9 - iter 140/146 - loss 0.12379814 - time (sec): 91.24 - samples/sec: 470.00 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-10 21:57:41,307 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:57:41,307 EPOCH 9 done: loss 0.1234 - lr: 0.000018 |
|
2023-10-10 21:57:47,299 DEV : loss 0.1573924571275711 - f1-score (micro avg) 0.6348 |
|
2023-10-10 21:57:47,308 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:57:56,128 epoch 10 - iter 14/146 - loss 0.10385368 - time (sec): 8.82 - samples/sec: 508.61 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-10 21:58:04,303 epoch 10 - iter 28/146 - loss 0.12007872 - time (sec): 16.99 - samples/sec: 479.61 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-10 21:58:12,720 epoch 10 - iter 42/146 - loss 0.11224087 - time (sec): 25.41 - samples/sec: 485.72 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-10 21:58:22,256 epoch 10 - iter 56/146 - loss 0.10217048 - time (sec): 34.95 - samples/sec: 503.12 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-10 21:58:31,488 epoch 10 - iter 70/146 - loss 0.10189941 - time (sec): 44.18 - samples/sec: 506.27 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-10 21:58:39,895 epoch 10 - iter 84/146 - loss 0.10134392 - time (sec): 52.59 - samples/sec: 499.85 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-10 21:58:48,328 epoch 10 - iter 98/146 - loss 0.10206351 - time (sec): 61.02 - samples/sec: 498.54 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-10 21:58:57,115 epoch 10 - iter 112/146 - loss 0.10614914 - time (sec): 69.81 - samples/sec: 497.17 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-10 21:59:05,599 epoch 10 - iter 126/146 - loss 0.10946359 - time (sec): 78.29 - samples/sec: 494.50 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-10 21:59:14,323 epoch 10 - iter 140/146 - loss 0.11293547 - time (sec): 87.01 - samples/sec: 494.56 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-10 21:59:17,682 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:59:17,683 EPOCH 10 done: loss 0.1160 - lr: 0.000001 |
|
2023-10-10 21:59:23,725 DEV : loss 0.15526741743087769 - f1-score (micro avg) 0.6783 |
|
2023-10-10 21:59:23,734 saving best model |
|
2023-10-10 21:59:32,858 ---------------------------------------------------------------------------------------------------- |
|
2023-10-10 21:59:32,860 Loading model from best epoch ... |
|
2023-10-10 21:59:38,272 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd |
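The best checkpoint saved above (best-model.pt under the base path reported earlier) can be loaded for inference with Flair's standard API. A short usage sketch; the example sentence is purely illustrative:

    # Load the best checkpoint saved during this run and tag a sentence.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(
        "hmbench-newseye/fi-hmbyt5-preliminary/"
        "byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00015-poolingfirst-layers-1-crfFalse-1/"
        "best-model.pt"
    )

    sentence = Sentence("Helsingin Sanomat kertoi asiasta.")  # illustrative example text
    tagger.predict(sentence)
    for span in sentence.get_spans("ner"):   # spans decoded from the BIOES tag set listed above
        print(span.text, span.tag, f"{span.score:.2f}")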
|
2023-10-10 21:59:51,377 |
|
Results: |
|
- F-score (micro) 0.7005 |
|
- F-score (macro) 0.6111 |
|
- Accuracy 0.5721 |
|
|
|
By class: |
|
              precision    recall  f1-score   support

         PER     0.7970    0.7672    0.7818       348
         LOC     0.5960    0.7969    0.6820       261
         ORG     0.2857    0.3077    0.2963        52
   HumanProd     0.8125    0.5909    0.6842        22

   micro avg     0.6667    0.7379    0.7005       683
   macro avg     0.6228    0.6157    0.6111       683
weighted avg     0.6818    0.7379    0.7036       683
|
|
|
2023-10-10 21:59:51,378 ---------------------------------------------------------------------------------------------------- |
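As a quick arithmetic check on the final report, the micro-averaged F-score follows directly from the micro precision and recall above (standard harmonic mean):

    # Micro-averaged F1 is the harmonic mean of micro precision and recall.
    precision, recall = 0.6667, 0.7379
    f1 = 2 * precision * recall / (precision + recall)
    print(round(f1, 4))   # 0.7005 -> matches "F-score (micro) 0.7005" above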
|
|