2023-10-17 13:24:18,291 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:18,292 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 13:24:18,292 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:18,292 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 13:24:18,292 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:18,292 Train: 7142 sentences
2023-10-17 13:24:18,292 (train_with_dev=False, train_with_test=False)
2023-10-17 13:24:18,292 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:18,292 Training Params:
2023-10-17 13:24:18,292 - learning_rate: "5e-05"
2023-10-17 13:24:18,292 - mini_batch_size: "8"
2023-10-17 13:24:18,292 - max_epochs: "10"
2023-10-17 13:24:18,292 - shuffle: "True"
2023-10-17 13:24:18,292 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:18,292 Plugins:
2023-10-17 13:24:18,292 - TensorboardLogger
2023-10-17 13:24:18,292 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 13:24:18,292 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:18,292 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 13:24:18,292 - metric: "('micro avg', 'f1-score')"
2023-10-17 13:24:18,292 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:18,292 Computation:
2023-10-17 13:24:18,292 - compute on device: cuda:0
2023-10-17 13:24:18,293 - embedding storage: none
2023-10-17 13:24:18,293 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:18,293 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 13:24:18,293 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:18,293 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:18,293 Logging anything other than scalars to TensorBoard is currently not supported.
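
For context, the run configured above is a standard Flair fine-tuning setup. Below is a minimal sketch (not part of the log) of a script that would reproduce these training parameters, written against the Flair 0.13-era API; the corpus-loader arguments and the Hugging Face checkpoint name are assumptions inferred from the dataset path and base path logged above.

# Minimal sketch, assuming the Flair 0.13-era API; checkpoint name and corpus
# arguments are assumptions inferred from the paths logged above.
from flair.datasets import NER_HIPE_2022
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Corpus matching "/root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator"
corpus = NER_HIPE_2022(dataset_name="newseye", language="fr", add_document_separator=True)
label_dict = corpus.make_label_dictionary(label_type="ner")

# Transformer embeddings: last layer only, first-subtoken pooling (as encoded in the base path)
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed HF checkpoint
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Tagger without CRF or RNN: a single linear layer over the embeddings (17 output tags)
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# Fine-tune with the hyperparameters shown under "Training Params" above
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2",
    learning_rate=5e-05,
    mini_batch_size=8,
    max_epochs=10,
)
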
2023-10-17 13:24:24,918 epoch 1 - iter 89/893 - loss 2.75866120 - time (sec): 6.62 - samples/sec: 3473.86 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:24:32,428 epoch 1 - iter 178/893 - loss 1.58176161 - time (sec): 14.13 - samples/sec: 3513.32 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:24:39,299 epoch 1 - iter 267/893 - loss 1.18997196 - time (sec): 21.01 - samples/sec: 3548.14 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:24:46,281 epoch 1 - iter 356/893 - loss 0.95916455 - time (sec): 27.99 - samples/sec: 3612.14 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:24:52,902 epoch 1 - iter 445/893 - loss 0.81844743 - time (sec): 34.61 - samples/sec: 3606.55 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:24:59,404 epoch 1 - iter 534/893 - loss 0.72551967 - time (sec): 41.11 - samples/sec: 3603.88 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:25:06,164 epoch 1 - iter 623/893 - loss 0.64807688 - time (sec): 47.87 - samples/sec: 3609.93 - lr: 0.000035 - momentum: 0.000000
2023-10-17 13:25:13,480 epoch 1 - iter 712/893 - loss 0.58175950 - time (sec): 55.19 - samples/sec: 3610.39 - lr: 0.000040 - momentum: 0.000000
2023-10-17 13:25:20,051 epoch 1 - iter 801/893 - loss 0.53640711 - time (sec): 61.76 - samples/sec: 3608.51 - lr: 0.000045 - momentum: 0.000000
2023-10-17 13:25:26,675 epoch 1 - iter 890/893 - loss 0.49591919 - time (sec): 68.38 - samples/sec: 3629.34 - lr: 0.000050 - momentum: 0.000000
2023-10-17 13:25:26,847 ----------------------------------------------------------------------------------------------------
2023-10-17 13:25:26,847 EPOCH 1 done: loss 0.4953 - lr: 0.000050
2023-10-17 13:25:30,025 DEV : loss 0.13788369297981262 - f1-score (micro avg) 0.7176
2023-10-17 13:25:30,041 saving best model
2023-10-17 13:25:30,417 ----------------------------------------------------------------------------------------------------
2023-10-17 13:25:37,070 epoch 2 - iter 89/893 - loss 0.12012810 - time (sec): 6.65 - samples/sec: 3655.67 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:25:43,766 epoch 2 - iter 178/893 - loss 0.11907211 - time (sec): 13.35 - samples/sec: 3598.83 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:25:50,441 epoch 2 - iter 267/893 - loss 0.11594867 - time (sec): 20.02 - samples/sec: 3490.94 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:25:57,844 epoch 2 - iter 356/893 - loss 0.11429689 - time (sec): 27.43 - samples/sec: 3466.61 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:26:04,965 epoch 2 - iter 445/893 - loss 0.11235555 - time (sec): 34.55 - samples/sec: 3515.49 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:26:12,139 epoch 2 - iter 534/893 - loss 0.11174439 - time (sec): 41.72 - samples/sec: 3532.14 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:26:19,555 epoch 2 - iter 623/893 - loss 0.10874895 - time (sec): 49.14 - samples/sec: 3559.42 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:26:26,357 epoch 2 - iter 712/893 - loss 0.11039562 - time (sec): 55.94 - samples/sec: 3574.71 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:26:33,252 epoch 2 - iter 801/893 - loss 0.10946145 - time (sec): 62.83 - samples/sec: 3571.72 - lr: 0.000045 - momentum: 0.000000
2023-10-17 13:26:40,028 epoch 2 - iter 890/893 - loss 0.10884277 - time (sec): 69.61 - samples/sec: 3564.35 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:26:40,230 ----------------------------------------------------------------------------------------------------
2023-10-17 13:26:40,230 EPOCH 2 done: loss 0.1088 - lr: 0.000044
2023-10-17 13:26:44,981 DEV : loss 0.11145459860563278 - f1-score (micro avg) 0.7591
2023-10-17 13:26:44,997 saving best model
2023-10-17 13:26:45,466 ----------------------------------------------------------------------------------------------------
2023-10-17 13:26:52,240 epoch 3 - iter 89/893 - loss 0.06579565 - time (sec): 6.77 - samples/sec: 3623.69 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:26:59,465 epoch 3 - iter 178/893 - loss 0.06780791 - time (sec): 14.00 - samples/sec: 3529.80 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:27:06,519 epoch 3 - iter 267/893 - loss 0.06511173 - time (sec): 21.05 - samples/sec: 3546.04 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:27:13,320 epoch 3 - iter 356/893 - loss 0.06708858 - time (sec): 27.85 - samples/sec: 3533.37 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:27:20,468 epoch 3 - iter 445/893 - loss 0.06704685 - time (sec): 35.00 - samples/sec: 3543.72 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:27:27,155 epoch 3 - iter 534/893 - loss 0.06913881 - time (sec): 41.69 - samples/sec: 3549.64 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:27:33,910 epoch 3 - iter 623/893 - loss 0.07009245 - time (sec): 48.44 - samples/sec: 3557.20 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:27:41,521 epoch 3 - iter 712/893 - loss 0.07061403 - time (sec): 56.05 - samples/sec: 3538.67 - lr: 0.000040 - momentum: 0.000000
2023-10-17 13:27:48,514 epoch 3 - iter 801/893 - loss 0.07172932 - time (sec): 63.05 - samples/sec: 3535.47 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:27:55,514 epoch 3 - iter 890/893 - loss 0.07158095 - time (sec): 70.05 - samples/sec: 3535.31 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:27:55,798 ----------------------------------------------------------------------------------------------------
2023-10-17 13:27:55,798 EPOCH 3 done: loss 0.0718 - lr: 0.000039
2023-10-17 13:27:59,930 DEV : loss 0.11546944081783295 - f1-score (micro avg) 0.8022
2023-10-17 13:27:59,947 saving best model
2023-10-17 13:28:00,357 ----------------------------------------------------------------------------------------------------
2023-10-17 13:28:07,372 epoch 4 - iter 89/893 - loss 0.04412197 - time (sec): 7.01 - samples/sec: 3686.05 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:28:14,225 epoch 4 - iter 178/893 - loss 0.04610454 - time (sec): 13.87 - samples/sec: 3640.62 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:28:21,074 epoch 4 - iter 267/893 - loss 0.04657536 - time (sec): 20.72 - samples/sec: 3671.19 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:28:28,332 epoch 4 - iter 356/893 - loss 0.04801474 - time (sec): 27.97 - samples/sec: 3645.53 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:28:35,026 epoch 4 - iter 445/893 - loss 0.04910268 - time (sec): 34.67 - samples/sec: 3628.37 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:28:42,276 epoch 4 - iter 534/893 - loss 0.04764748 - time (sec): 41.92 - samples/sec: 3606.03 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:28:49,159 epoch 4 - iter 623/893 - loss 0.04869073 - time (sec): 48.80 - samples/sec: 3610.17 - lr: 0.000035 - momentum: 0.000000
2023-10-17 13:28:55,552 epoch 4 - iter 712/893 - loss 0.04835033 - time (sec): 55.19 - samples/sec: 3619.42 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:29:02,534 epoch 4 - iter 801/893 - loss 0.04802865 - time (sec): 62.18 - samples/sec: 3603.83 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:29:09,448 epoch 4 - iter 890/893 - loss 0.04862439 - time (sec): 69.09 - samples/sec: 3586.93 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:29:09,658 ----------------------------------------------------------------------------------------------------
2023-10-17 13:29:09,658 EPOCH 4 done: loss 0.0486 - lr: 0.000033
2023-10-17 13:29:14,333 DEV : loss 0.14231492578983307 - f1-score (micro avg) 0.7979
2023-10-17 13:29:14,349 ----------------------------------------------------------------------------------------------------
2023-10-17 13:29:21,265 epoch 5 - iter 89/893 - loss 0.02919832 - time (sec): 6.92 - samples/sec: 3587.21 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:29:28,578 epoch 5 - iter 178/893 - loss 0.03380540 - time (sec): 14.23 - samples/sec: 3596.81 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:29:35,749 epoch 5 - iter 267/893 - loss 0.03673622 - time (sec): 21.40 - samples/sec: 3588.93 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:29:42,735 epoch 5 - iter 356/893 - loss 0.03585695 - time (sec): 28.39 - samples/sec: 3571.78 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:29:49,409 epoch 5 - iter 445/893 - loss 0.03553433 - time (sec): 35.06 - samples/sec: 3555.51 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:29:56,393 epoch 5 - iter 534/893 - loss 0.03588921 - time (sec): 42.04 - samples/sec: 3566.40 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:30:03,143 epoch 5 - iter 623/893 - loss 0.03598839 - time (sec): 48.79 - samples/sec: 3568.67 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:30:09,807 epoch 5 - iter 712/893 - loss 0.03692474 - time (sec): 55.46 - samples/sec: 3586.34 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:30:16,339 epoch 5 - iter 801/893 - loss 0.03697731 - time (sec): 61.99 - samples/sec: 3570.11 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:30:23,926 epoch 5 - iter 890/893 - loss 0.03792902 - time (sec): 69.58 - samples/sec: 3562.29 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:30:24,182 ----------------------------------------------------------------------------------------------------
2023-10-17 13:30:24,182 EPOCH 5 done: loss 0.0378 - lr: 0.000028
2023-10-17 13:30:28,303 DEV : loss 0.1556406468153 - f1-score (micro avg) 0.7992
2023-10-17 13:30:28,320 ----------------------------------------------------------------------------------------------------
2023-10-17 13:30:35,522 epoch 6 - iter 89/893 - loss 0.03244423 - time (sec): 7.20 - samples/sec: 3480.52 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:30:42,825 epoch 6 - iter 178/893 - loss 0.02917985 - time (sec): 14.50 - samples/sec: 3535.82 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:30:49,786 epoch 6 - iter 267/893 - loss 0.02840967 - time (sec): 21.47 - samples/sec: 3499.22 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:30:56,763 epoch 6 - iter 356/893 - loss 0.02860021 - time (sec): 28.44 - samples/sec: 3506.93 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:31:03,699 epoch 6 - iter 445/893 - loss 0.02825516 - time (sec): 35.38 - samples/sec: 3497.03 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:31:10,471 epoch 6 - iter 534/893 - loss 0.02743104 - time (sec): 42.15 - samples/sec: 3508.88 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:31:17,404 epoch 6 - iter 623/893 - loss 0.02858377 - time (sec): 49.08 - samples/sec: 3519.50 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:31:24,472 epoch 6 - iter 712/893 - loss 0.02891426 - time (sec): 56.15 - samples/sec: 3536.30 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:31:31,162 epoch 6 - iter 801/893 - loss 0.02921549 - time (sec): 62.84 - samples/sec: 3545.04 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:31:37,875 epoch 6 - iter 890/893 - loss 0.02861389 - time (sec): 69.55 - samples/sec: 3563.39 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:31:38,099 ----------------------------------------------------------------------------------------------------
2023-10-17 13:31:38,099 EPOCH 6 done: loss 0.0286 - lr: 0.000022
2023-10-17 13:31:42,231 DEV : loss 0.19660857319831848 - f1-score (micro avg) 0.8014
2023-10-17 13:31:42,248 ----------------------------------------------------------------------------------------------------
2023-10-17 13:31:49,341 epoch 7 - iter 89/893 - loss 0.01678255 - time (sec): 7.09 - samples/sec: 3621.51 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:31:56,211 epoch 7 - iter 178/893 - loss 0.01553467 - time (sec): 13.96 - samples/sec: 3639.54 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:32:02,722 epoch 7 - iter 267/893 - loss 0.01925221 - time (sec): 20.47 - samples/sec: 3658.02 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:32:09,570 epoch 7 - iter 356/893 - loss 0.02042759 - time (sec): 27.32 - samples/sec: 3676.51 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:32:16,679 epoch 7 - iter 445/893 - loss 0.02070852 - time (sec): 34.43 - samples/sec: 3652.01 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:32:23,162 epoch 7 - iter 534/893 - loss 0.02114971 - time (sec): 40.91 - samples/sec: 3654.12 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:32:30,075 epoch 7 - iter 623/893 - loss 0.02086037 - time (sec): 47.83 - samples/sec: 3667.87 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:32:36,985 epoch 7 - iter 712/893 - loss 0.02012071 - time (sec): 54.74 - samples/sec: 3654.83 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:32:43,800 epoch 7 - iter 801/893 - loss 0.02038745 - time (sec): 61.55 - samples/sec: 3617.12 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:32:50,696 epoch 7 - iter 890/893 - loss 0.02046484 - time (sec): 68.45 - samples/sec: 3615.30 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:32:50,929 ----------------------------------------------------------------------------------------------------
2023-10-17 13:32:50,930 EPOCH 7 done: loss 0.0204 - lr: 0.000017
2023-10-17 13:32:55,592 DEV : loss 0.2123110592365265 - f1-score (micro avg) 0.8067
2023-10-17 13:32:55,610 saving best model
2023-10-17 13:32:56,074 ----------------------------------------------------------------------------------------------------
2023-10-17 13:33:03,283 epoch 8 - iter 89/893 - loss 0.01777584 - time (sec): 7.21 - samples/sec: 3367.87 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:33:10,120 epoch 8 - iter 178/893 - loss 0.01370706 - time (sec): 14.04 - samples/sec: 3476.86 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:33:16,860 epoch 8 - iter 267/893 - loss 0.01393815 - time (sec): 20.78 - samples/sec: 3522.88 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:33:23,827 epoch 8 - iter 356/893 - loss 0.01526503 - time (sec): 27.75 - samples/sec: 3514.42 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:33:30,667 epoch 8 - iter 445/893 - loss 0.01611125 - time (sec): 34.59 - samples/sec: 3537.20 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:33:37,620 epoch 8 - iter 534/893 - loss 0.01582779 - time (sec): 41.54 - samples/sec: 3521.50 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:33:44,461 epoch 8 - iter 623/893 - loss 0.01549864 - time (sec): 48.38 - samples/sec: 3546.15 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:33:52,073 epoch 8 - iter 712/893 - loss 0.01573317 - time (sec): 56.00 - samples/sec: 3543.71 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:33:58,734 epoch 8 - iter 801/893 - loss 0.01601116 - time (sec): 62.66 - samples/sec: 3552.01 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:34:05,638 epoch 8 - iter 890/893 - loss 0.01543464 - time (sec): 69.56 - samples/sec: 3567.03 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:34:05,852 ----------------------------------------------------------------------------------------------------
2023-10-17 13:34:05,852 EPOCH 8 done: loss 0.0155 - lr: 0.000011
2023-10-17 13:34:10,003 DEV : loss 0.22176583111286163 - f1-score (micro avg) 0.8003
2023-10-17 13:34:10,021 ----------------------------------------------------------------------------------------------------
2023-10-17 13:34:18,103 epoch 9 - iter 89/893 - loss 0.01083963 - time (sec): 8.08 - samples/sec: 3159.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:34:25,070 epoch 9 - iter 178/893 - loss 0.00956570 - time (sec): 15.05 - samples/sec: 3388.71 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:34:31,928 epoch 9 - iter 267/893 - loss 0.01138130 - time (sec): 21.91 - samples/sec: 3434.27 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:34:38,528 epoch 9 - iter 356/893 - loss 0.01058467 - time (sec): 28.51 - samples/sec: 3469.81 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:34:45,321 epoch 9 - iter 445/893 - loss 0.00921370 - time (sec): 35.30 - samples/sec: 3500.48 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:34:51,981 epoch 9 - iter 534/893 - loss 0.00953823 - time (sec): 41.96 - samples/sec: 3527.17 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:34:58,609 epoch 9 - iter 623/893 - loss 0.00962444 - time (sec): 48.59 - samples/sec: 3523.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:35:05,539 epoch 9 - iter 712/893 - loss 0.01033271 - time (sec): 55.52 - samples/sec: 3533.02 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:35:12,598 epoch 9 - iter 801/893 - loss 0.01059466 - time (sec): 62.58 - samples/sec: 3540.07 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:35:19,637 epoch 9 - iter 890/893 - loss 0.01016129 - time (sec): 69.61 - samples/sec: 3565.62 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:35:19,813 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:19,813 EPOCH 9 done: loss 0.0101 - lr: 0.000006
2023-10-17 13:35:23,967 DEV : loss 0.2231239527463913 - f1-score (micro avg) 0.8124
2023-10-17 13:35:23,985 saving best model
2023-10-17 13:35:24,449 ----------------------------------------------------------------------------------------------------
2023-10-17 13:35:31,551 epoch 10 - iter 89/893 - loss 0.00425626 - time (sec): 7.10 - samples/sec: 3586.43 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:35:38,709 epoch 10 - iter 178/893 - loss 0.00582628 - time (sec): 14.25 - samples/sec: 3585.95 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:35:45,652 epoch 10 - iter 267/893 - loss 0.00793966 - time (sec): 21.20 - samples/sec: 3592.75 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:35:52,247 epoch 10 - iter 356/893 - loss 0.00702058 - time (sec): 27.79 - samples/sec: 3578.58 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:35:59,564 epoch 10 - iter 445/893 - loss 0.00689434 - time (sec): 35.11 - samples/sec: 3565.57 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:36:06,713 epoch 10 - iter 534/893 - loss 0.00633107 - time (sec): 42.26 - samples/sec: 3543.91 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:36:13,298 epoch 10 - iter 623/893 - loss 0.00613990 - time (sec): 48.84 - samples/sec: 3557.23 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:36:20,364 epoch 10 - iter 712/893 - loss 0.00603713 - time (sec): 55.91 - samples/sec: 3563.05 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:36:27,000 epoch 10 - iter 801/893 - loss 0.00645980 - time (sec): 62.55 - samples/sec: 3571.76 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:36:34,138 epoch 10 - iter 890/893 - loss 0.00649064 - time (sec): 69.68 - samples/sec: 3557.52 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:36:34,324 ----------------------------------------------------------------------------------------------------
2023-10-17 13:36:34,324 EPOCH 10 done: loss 0.0065 - lr: 0.000000
2023-10-17 13:36:39,073 DEV : loss 0.23280413448810577 - f1-score (micro avg) 0.8165
2023-10-17 13:36:39,090 saving best model
2023-10-17 13:36:39,905 ----------------------------------------------------------------------------------------------------
2023-10-17 13:36:39,907 Loading model from best epoch ...
2023-10-17 13:36:41,238 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 13:36:50,854 
Results:
- F-score (micro) 0.6959
- F-score (macro) 0.6247
- Accuracy 0.5492

By class:
              precision    recall  f1-score   support

         LOC     0.6964    0.6995    0.6979      1095
         PER     0.7849    0.7826    0.7838      1012
         ORG     0.4352    0.5742    0.4952       357
   HumanProd     0.4068    0.7273    0.5217        33

   micro avg     0.6772    0.7157    0.6959      2497
   macro avg     0.5808    0.6959    0.6247      2497
weighted avg     0.6911    0.7157    0.7014      2497

2023-10-17 13:36:50,854 ----------------------------------------------------------------------------------------------------
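
The best checkpoint saved during this run can be reloaded for inference with the usual Flair API. A minimal sketch (not part of the log), assuming the base path from this run and an illustrative input sentence:

# Minimal sketch, assuming the standard Flair API; the sentence is hypothetical.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load(
    "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2/best-model.pt"
)

# Tag an example sentence; predictions use the PER/LOC/ORG/HumanProd labels listed above
sentence = Sentence("Le maire de Paris a rencontré les journalistes du Figaro.")
tagger.predict(sentence)

for entity in sentence.get_spans("ner"):
    print(entity)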