Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697668532.46dc0c540dd0.3571.9 +3 -0
- test.tsv +0 -0
- training.log +245 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e0cd5e1bb4d479461a2dbf93f5953c1fb3ad4d947e4310110c653929a7522588
+size 19045922
dev.tsv
ADDED
The diff for this file is too large to render.
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH  TIMESTAMP  LEARNING_RATE  TRAIN_LOSS  DEV_LOSS  DEV_PRECISION  DEV_RECALL  DEV_F1  DEV_ACCURACY
+1      22:35:58   0.0000         0.7706      0.2561    0.8655         0.1064      0.1895  0.1047
+2      22:36:24   0.0000         0.1995      0.2152    0.6696         0.2386      0.3519  0.2165
+3      22:36:50   0.0000         0.1663      0.2010    0.6680         0.3430      0.4532  0.2978
+4      22:37:16   0.0000         0.1487      0.1871    0.6536         0.4308      0.5193  0.3617
+5      22:37:42   0.0000         0.1353      0.1946    0.6546         0.4680      0.5458  0.3852
+6      22:38:07   0.0000         0.1276      0.1798    0.6068         0.5196      0.5598  0.4024
+7      22:38:33   0.0000         0.1195      0.1770    0.6476         0.5372      0.5872  0.4276
+8      22:38:59   0.0000         0.1134      0.1852    0.6528         0.5207      0.5793  0.4204
+9      22:39:25   0.0000         0.1097      0.1854    0.6613         0.5465      0.5984  0.4408
+10     22:39:51   0.0000         0.1055      0.1872    0.6571         0.5444      0.5955  0.4381
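The table in loss.tsv is easiest to read by asking which epoch scored best on the dev set. The following is a minimal sketch of that lookup; the rows are copied from the file above, and the helper names `parse_loss_table` and `best_epoch` are hypothetical, not part of Flair.

```python
# Rows copied from loss.tsv above (whitespace-separated here for readability).
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 22:35:58 0.0000 0.7706 0.2561 0.8655 0.1064 0.1895 0.1047
2 22:36:24 0.0000 0.1995 0.2152 0.6696 0.2386 0.3519 0.2165
3 22:36:50 0.0000 0.1663 0.2010 0.6680 0.3430 0.4532 0.2978
4 22:37:16 0.0000 0.1487 0.1871 0.6536 0.4308 0.5193 0.3617
5 22:37:42 0.0000 0.1353 0.1946 0.6546 0.4680 0.5458 0.3852
6 22:38:07 0.0000 0.1276 0.1798 0.6068 0.5196 0.5598 0.4024
7 22:38:33 0.0000 0.1195 0.1770 0.6476 0.5372 0.5872 0.4276
8 22:38:59 0.0000 0.1134 0.1852 0.6528 0.5207 0.5793 0.4204
9 22:39:25 0.0000 0.1097 0.1854 0.6613 0.5465 0.5984 0.4408
10 22:39:51 0.0000 0.1055 0.1872 0.6571 0.5444 0.5955 0.4381
"""


def parse_loss_table(text):
    """Turn the header plus rows into a list of dicts keyed by column name."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def best_epoch(rows, metric="DEV_F1"):
    """Return (epoch, value) for the row with the highest value of `metric`."""
    best = max(rows, key=lambda row: float(row[metric]))
    return int(best["EPOCH"]), float(best[metric])


rows = parse_loss_table(LOSS_TSV)
f1_epoch, f1_value = best_epoch(rows)
# Dev loss is better when lower, so it needs min() rather than max().
loss_epoch = int(min(rows, key=lambda row: float(row["DEV_LOSS"]))["EPOCH"])
print(f"best DEV_F1: epoch {f1_epoch} ({f1_value})")
print(f"lowest DEV_LOSS: epoch {loss_epoch}")
```

On these rows the highest DEV_F1 falls in epoch 9 while the lowest DEV_LOSS falls in epoch 7, so the two criteria would pick different checkpoints. The training log selects on micro-averaged F1.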
runs/events.out.tfevents.1697668532.46dc0c540dd0.3571.9
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:40b645bd6a11d85be2a55cc0838521e356ae0f437e556129f171ddf4e2940451
+size 808480
test.tsv
ADDED
The diff for this file is too large to render.
training.log
ADDED
@@ -0,0 +1,245 @@
+2023-10-18 22:35:32,803 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:35:32,803 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 128)
+        (position_embeddings): Embedding(512, 128)
+        (token_type_embeddings): Embedding(2, 128)
+        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-1): 2 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=128, out_features=128, bias=True)
+                (key): Linear(in_features=128, out_features=128, bias=True)
+                (value): Linear(in_features=128, out_features=128, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=128, out_features=128, bias=True)
+                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=128, out_features=512, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=512, out_features=128, bias=True)
+              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=128, out_features=128, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=128, out_features=13, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
+2023-10-18 22:35:32,803 ----------------------------------------------------------------------------------------------------
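The layer shapes printed above are enough to estimate the model's parameter count by hand. The sketch below does that arithmetic, assuming the printed modules are the only ones holding weights and that no weights are shared; the helper names are hypothetical.

```python
def linear(n_in, n_out, bias=True):
    """Parameters of a fully connected layer: weight matrix plus optional bias."""
    return n_in * n_out + (n_out if bias else 0)


def layer_norm(dim):
    """LayerNorm with elementwise_affine=True holds one scale and one shift vector."""
    return 2 * dim


# Sizes read off the printed module tree above.
HIDDEN, FFN, VOCAB, POSITIONS, TOKEN_TYPES, LAYERS, TAGS = 128, 512, 32001, 512, 2, 2, 13

embeddings = (VOCAB + POSITIONS + TOKEN_TYPES) * HIDDEN + layer_norm(HIDDEN)
# query, key, value and the attention output projection, followed by a LayerNorm
attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm(HIDDEN)
# intermediate (128 -> 512) and output (512 -> 128) projections, followed by a LayerNorm
feed_forward = linear(HIDDEN, FFN) + linear(FFN, HIDDEN) + layer_norm(HIDDEN)
encoder = LAYERS * (attention + feed_forward)
pooler = linear(HIDDEN, HIDDEN)
tagger_head = linear(HIDDEN, TAGS)

total = embeddings + encoder + pooler + tagger_head
print(f"embeddings:  {embeddings:>9,}")
print(f"encoder:     {encoder:>9,}")
print(f"pooler:      {pooler:>9,}")
print(f"tagger head: {tagger_head:>9,}")
print(f"total:       {total:>9,}")
```

Under these assumptions the word embedding matrix accounts for most of the roughly 4.6 million parameters. At four bytes per float32 value, that is about 18.3 MB, which is in the same range as the 19,045,922 byte size recorded for best-model.pt.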
+2023-10-18 22:35:32,803 MultiCorpus: 5777 train + 722 dev + 723 test sentences
+- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
+2023-10-18 22:35:32,803 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:35:32,803 Train: 5777 sentences
+2023-10-18 22:35:32,803 (train_with_dev=False, train_with_test=False)
+2023-10-18 22:35:32,803 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:35:32,803 Training Params:
+2023-10-18 22:35:32,803 - learning_rate: "5e-05"
+2023-10-18 22:35:32,803 - mini_batch_size: "4"
+2023-10-18 22:35:32,804 - max_epochs: "10"
+2023-10-18 22:35:32,804 - shuffle: "True"
+2023-10-18 22:35:32,804 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:35:32,804 Plugins:
+2023-10-18 22:35:32,804 - TensorboardLogger
+2023-10-18 22:35:32,804 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-18 22:35:32,804 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:35:32,804 Final evaluation on model from best epoch (best-model.pt)
+2023-10-18 22:35:32,804 - metric: "('micro avg', 'f1-score')"
+2023-10-18 22:35:32,804 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:35:32,804 Computation:
+2023-10-18 22:35:32,804 - compute on device: cuda:0
+2023-10-18 22:35:32,804 - embedding storage: none
+2023-10-18 22:35:32,804 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:35:32,804 Model training base path: "hmbench-icdar/nl-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
+2023-10-18 22:35:32,804 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:35:32,804 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:35:32,804 Logging anything other than scalars to TensorBoard is currently not supported.
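The parameters above (learning_rate 5e-05, max_epochs 10, LinearScheduler with warmup_fraction 0.1) together with the 1445 iterations per epoch shown in the log suggest a linear warmup followed by a linear decay to zero. The sketch below is a generic version of such a schedule, written as an assumption about its shape and not taken from Flair's source; `linear_warmup_lr` is a hypothetical helper.

```python
def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction):
    """Linear ramp from 0 to peak_lr over the warmup steps, then linear decay to 0."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


ITERS_PER_EPOCH = 1445   # "iter N/1445" in the log
MAX_EPOCHS = 10          # max_epochs in Training Params
PEAK_LR = 5e-05          # learning_rate in Training Params
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS


def lr_at(epoch, iteration):
    """Learning rate at a given 1-based epoch and iteration, formatted like the log."""
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    return f"{linear_warmup_lr(step, TOTAL_STEPS, PEAK_LR, 0.1):.6f}"


for epoch, iteration in [(1, 144), (1, 1440), (2, 1440), (5, 1440), (10, 1440)]:
    print(f"epoch {epoch} - iter {iteration}/{ITERS_PER_EPOCH} - lr: {lr_at(epoch, iteration)}")
```

With a warmup fraction of 0.1 over 10 epochs, the warmup spans exactly the first epoch. That is consistent with the logged rate climbing to 0.000050 by the end of epoch 1 and falling afterwards.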
+2023-10-18 22:35:35,293 epoch 1 - iter 144/1445 - loss 2.86625285 - time (sec): 2.49 - samples/sec: 7167.51 - lr: 0.000005 - momentum: 0.000000
+2023-10-18 22:35:37,764 epoch 1 - iter 288/1445 - loss 2.39182936 - time (sec): 4.96 - samples/sec: 7294.65 - lr: 0.000010 - momentum: 0.000000
+2023-10-18 22:35:40,191 epoch 1 - iter 432/1445 - loss 1.86271468 - time (sec): 7.39 - samples/sec: 7225.61 - lr: 0.000015 - momentum: 0.000000
+2023-10-18 22:35:42,586 epoch 1 - iter 576/1445 - loss 1.48735779 - time (sec): 9.78 - samples/sec: 7273.86 - lr: 0.000020 - momentum: 0.000000
+2023-10-18 22:35:44,941 epoch 1 - iter 720/1445 - loss 1.24724928 - time (sec): 12.14 - samples/sec: 7369.69 - lr: 0.000025 - momentum: 0.000000
+2023-10-18 22:35:47,348 epoch 1 - iter 864/1445 - loss 1.09616008 - time (sec): 14.54 - samples/sec: 7359.26 - lr: 0.000030 - momentum: 0.000000
+2023-10-18 22:35:49,769 epoch 1 - iter 1008/1445 - loss 0.97810336 - time (sec): 16.96 - samples/sec: 7381.19 - lr: 0.000035 - momentum: 0.000000
+2023-10-18 22:35:52,205 epoch 1 - iter 1152/1445 - loss 0.89270314 - time (sec): 19.40 - samples/sec: 7340.76 - lr: 0.000040 - momentum: 0.000000
+2023-10-18 22:35:54,547 epoch 1 - iter 1296/1445 - loss 0.82673392 - time (sec): 21.74 - samples/sec: 7317.91 - lr: 0.000045 - momentum: 0.000000
+2023-10-18 22:35:56,894 epoch 1 - iter 1440/1445 - loss 0.77159928 - time (sec): 24.09 - samples/sec: 7297.89 - lr: 0.000050 - momentum: 0.000000
+2023-10-18 22:35:56,976 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:35:56,976 EPOCH 1 done: loss 0.7706 - lr: 0.000050
+2023-10-18 22:35:58,210 DEV : loss 0.2561441659927368 - f1-score (micro avg) 0.1895
+2023-10-18 22:35:58,224 saving best model
+2023-10-18 22:35:58,255 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:36:00,645 epoch 2 - iter 144/1445 - loss 0.20788149 - time (sec): 2.39 - samples/sec: 7083.52 - lr: 0.000049 - momentum: 0.000000
+2023-10-18 22:36:03,009 epoch 2 - iter 288/1445 - loss 0.21650715 - time (sec): 4.75 - samples/sec: 7262.68 - lr: 0.000049 - momentum: 0.000000
+2023-10-18 22:36:05,512 epoch 2 - iter 432/1445 - loss 0.20552295 - time (sec): 7.26 - samples/sec: 7238.70 - lr: 0.000048 - momentum: 0.000000
+2023-10-18 22:36:07,838 epoch 2 - iter 576/1445 - loss 0.20926347 - time (sec): 9.58 - samples/sec: 7233.85 - lr: 0.000048 - momentum: 0.000000
+2023-10-18 22:36:10,169 epoch 2 - iter 720/1445 - loss 0.20584467 - time (sec): 11.91 - samples/sec: 7258.79 - lr: 0.000047 - momentum: 0.000000
+2023-10-18 22:36:12,625 epoch 2 - iter 864/1445 - loss 0.20048058 - time (sec): 14.37 - samples/sec: 7265.04 - lr: 0.000047 - momentum: 0.000000
+2023-10-18 22:36:15,104 epoch 2 - iter 1008/1445 - loss 0.20118581 - time (sec): 16.85 - samples/sec: 7250.86 - lr: 0.000046 - momentum: 0.000000
+2023-10-18 22:36:17,549 epoch 2 - iter 1152/1445 - loss 0.19964640 - time (sec): 19.29 - samples/sec: 7315.99 - lr: 0.000046 - momentum: 0.000000
+2023-10-18 22:36:19,981 epoch 2 - iter 1296/1445 - loss 0.19527663 - time (sec): 21.73 - samples/sec: 7309.63 - lr: 0.000045 - momentum: 0.000000
+2023-10-18 22:36:22,379 epoch 2 - iter 1440/1445 - loss 0.19935605 - time (sec): 24.12 - samples/sec: 7281.82 - lr: 0.000044 - momentum: 0.000000
+2023-10-18 22:36:22,457 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:36:22,458 EPOCH 2 done: loss 0.1995 - lr: 0.000044
+2023-10-18 22:36:24,553 DEV : loss 0.2152344286441803 - f1-score (micro avg) 0.3519
+2023-10-18 22:36:24,567 saving best model
+2023-10-18 22:36:24,602 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:36:26,941 epoch 3 - iter 144/1445 - loss 0.18735705 - time (sec): 2.34 - samples/sec: 7338.15 - lr: 0.000044 - momentum: 0.000000
+2023-10-18 22:36:29,358 epoch 3 - iter 288/1445 - loss 0.16728570 - time (sec): 4.76 - samples/sec: 7410.31 - lr: 0.000043 - momentum: 0.000000
+2023-10-18 22:36:31,777 epoch 3 - iter 432/1445 - loss 0.16394077 - time (sec): 7.17 - samples/sec: 7362.02 - lr: 0.000043 - momentum: 0.000000
+2023-10-18 22:36:34,039 epoch 3 - iter 576/1445 - loss 0.16749371 - time (sec): 9.44 - samples/sec: 7493.93 - lr: 0.000042 - momentum: 0.000000
+2023-10-18 22:36:36,322 epoch 3 - iter 720/1445 - loss 0.16886016 - time (sec): 11.72 - samples/sec: 7376.05 - lr: 0.000042 - momentum: 0.000000
+2023-10-18 22:36:38,727 epoch 3 - iter 864/1445 - loss 0.16886698 - time (sec): 14.12 - samples/sec: 7388.63 - lr: 0.000041 - momentum: 0.000000
+2023-10-18 22:36:41,158 epoch 3 - iter 1008/1445 - loss 0.16881585 - time (sec): 16.55 - samples/sec: 7410.17 - lr: 0.000041 - momentum: 0.000000
+2023-10-18 22:36:43,478 epoch 3 - iter 1152/1445 - loss 0.16991782 - time (sec): 18.87 - samples/sec: 7372.83 - lr: 0.000040 - momentum: 0.000000
+2023-10-18 22:36:45,934 epoch 3 - iter 1296/1445 - loss 0.17156863 - time (sec): 21.33 - samples/sec: 7402.98 - lr: 0.000039 - momentum: 0.000000
+2023-10-18 22:36:48,481 epoch 3 - iter 1440/1445 - loss 0.16631620 - time (sec): 23.88 - samples/sec: 7358.61 - lr: 0.000039 - momentum: 0.000000
+2023-10-18 22:36:48,556 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:36:48,557 EPOCH 3 done: loss 0.1663 - lr: 0.000039
+2023-10-18 22:36:50,325 DEV : loss 0.2009856253862381 - f1-score (micro avg) 0.4532
+2023-10-18 22:36:50,339 saving best model
+2023-10-18 22:36:50,375 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:36:52,749 epoch 4 - iter 144/1445 - loss 0.15362789 - time (sec): 2.37 - samples/sec: 7481.75 - lr: 0.000038 - momentum: 0.000000
+2023-10-18 22:36:55,135 epoch 4 - iter 288/1445 - loss 0.15463328 - time (sec): 4.76 - samples/sec: 7160.04 - lr: 0.000038 - momentum: 0.000000
+2023-10-18 22:36:57,639 epoch 4 - iter 432/1445 - loss 0.15709099 - time (sec): 7.26 - samples/sec: 7172.16 - lr: 0.000037 - momentum: 0.000000
+2023-10-18 22:37:00,051 epoch 4 - iter 576/1445 - loss 0.14884886 - time (sec): 9.68 - samples/sec: 7236.95 - lr: 0.000037 - momentum: 0.000000
+2023-10-18 22:37:02,563 epoch 4 - iter 720/1445 - loss 0.14765128 - time (sec): 12.19 - samples/sec: 7264.09 - lr: 0.000036 - momentum: 0.000000
+2023-10-18 22:37:04,985 epoch 4 - iter 864/1445 - loss 0.14686407 - time (sec): 14.61 - samples/sec: 7275.14 - lr: 0.000036 - momentum: 0.000000
+2023-10-18 22:37:07,366 epoch 4 - iter 1008/1445 - loss 0.14711603 - time (sec): 16.99 - samples/sec: 7239.14 - lr: 0.000035 - momentum: 0.000000
+2023-10-18 22:37:09,825 epoch 4 - iter 1152/1445 - loss 0.14929145 - time (sec): 19.45 - samples/sec: 7229.84 - lr: 0.000034 - momentum: 0.000000
+2023-10-18 22:37:12,202 epoch 4 - iter 1296/1445 - loss 0.14891095 - time (sec): 21.83 - samples/sec: 7256.22 - lr: 0.000034 - momentum: 0.000000
+2023-10-18 22:37:14,697 epoch 4 - iter 1440/1445 - loss 0.14884730 - time (sec): 24.32 - samples/sec: 7222.75 - lr: 0.000033 - momentum: 0.000000
+2023-10-18 22:37:14,778 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:37:14,779 EPOCH 4 done: loss 0.1487 - lr: 0.000033
+2023-10-18 22:37:16,559 DEV : loss 0.18706543743610382 - f1-score (micro avg) 0.5193
+2023-10-18 22:37:16,573 saving best model
+2023-10-18 22:37:16,608 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:37:18,997 epoch 5 - iter 144/1445 - loss 0.13686569 - time (sec): 2.39 - samples/sec: 7121.40 - lr: 0.000033 - momentum: 0.000000
+2023-10-18 22:37:21,388 epoch 5 - iter 288/1445 - loss 0.12889149 - time (sec): 4.78 - samples/sec: 7193.65 - lr: 0.000032 - momentum: 0.000000
+2023-10-18 22:37:23,809 epoch 5 - iter 432/1445 - loss 0.13048179 - time (sec): 7.20 - samples/sec: 7178.96 - lr: 0.000032 - momentum: 0.000000
+2023-10-18 22:37:26,189 epoch 5 - iter 576/1445 - loss 0.13492520 - time (sec): 9.58 - samples/sec: 7161.88 - lr: 0.000031 - momentum: 0.000000
+2023-10-18 22:37:28,631 epoch 5 - iter 720/1445 - loss 0.13512870 - time (sec): 12.02 - samples/sec: 7268.82 - lr: 0.000031 - momentum: 0.000000
+2023-10-18 22:37:31,049 epoch 5 - iter 864/1445 - loss 0.13317304 - time (sec): 14.44 - samples/sec: 7327.92 - lr: 0.000030 - momentum: 0.000000
+2023-10-18 22:37:33,468 epoch 5 - iter 1008/1445 - loss 0.13354051 - time (sec): 16.86 - samples/sec: 7313.61 - lr: 0.000029 - momentum: 0.000000
+2023-10-18 22:37:35,963 epoch 5 - iter 1152/1445 - loss 0.13644795 - time (sec): 19.35 - samples/sec: 7314.08 - lr: 0.000029 - momentum: 0.000000
+2023-10-18 22:37:38,334 epoch 5 - iter 1296/1445 - loss 0.13740177 - time (sec): 21.72 - samples/sec: 7269.49 - lr: 0.000028 - momentum: 0.000000
+2023-10-18 22:37:40,700 epoch 5 - iter 1440/1445 - loss 0.13509823 - time (sec): 24.09 - samples/sec: 7284.09 - lr: 0.000028 - momentum: 0.000000
+2023-10-18 22:37:40,787 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:37:40,788 EPOCH 5 done: loss 0.1353 - lr: 0.000028
+2023-10-18 22:37:42,905 DEV : loss 0.19456517696380615 - f1-score (micro avg) 0.5458
+2023-10-18 22:37:42,919 saving best model
+2023-10-18 22:37:42,954 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:37:45,320 epoch 6 - iter 144/1445 - loss 0.11571536 - time (sec): 2.37 - samples/sec: 7102.98 - lr: 0.000027 - momentum: 0.000000
+2023-10-18 22:37:47,734 epoch 6 - iter 288/1445 - loss 0.12384211 - time (sec): 4.78 - samples/sec: 7258.64 - lr: 0.000027 - momentum: 0.000000
+2023-10-18 22:37:50,160 epoch 6 - iter 432/1445 - loss 0.13225920 - time (sec): 7.21 - samples/sec: 7253.70 - lr: 0.000026 - momentum: 0.000000
+2023-10-18 22:37:52,606 epoch 6 - iter 576/1445 - loss 0.12932638 - time (sec): 9.65 - samples/sec: 7336.81 - lr: 0.000026 - momentum: 0.000000
+2023-10-18 22:37:55,094 epoch 6 - iter 720/1445 - loss 0.13232474 - time (sec): 12.14 - samples/sec: 7421.31 - lr: 0.000025 - momentum: 0.000000
+2023-10-18 22:37:57,438 epoch 6 - iter 864/1445 - loss 0.13260978 - time (sec): 14.48 - samples/sec: 7330.43 - lr: 0.000024 - momentum: 0.000000
+2023-10-18 22:37:59,561 epoch 6 - iter 1008/1445 - loss 0.13037248 - time (sec): 16.61 - samples/sec: 7438.74 - lr: 0.000024 - momentum: 0.000000
+2023-10-18 22:38:01,644 epoch 6 - iter 1152/1445 - loss 0.12744445 - time (sec): 18.69 - samples/sec: 7497.79 - lr: 0.000023 - momentum: 0.000000
+2023-10-18 22:38:03,752 epoch 6 - iter 1296/1445 - loss 0.12733070 - time (sec): 20.80 - samples/sec: 7576.59 - lr: 0.000023 - momentum: 0.000000
+2023-10-18 22:38:05,837 epoch 6 - iter 1440/1445 - loss 0.12752773 - time (sec): 22.88 - samples/sec: 7676.69 - lr: 0.000022 - momentum: 0.000000
+2023-10-18 22:38:05,905 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:38:05,905 EPOCH 6 done: loss 0.1276 - lr: 0.000022
+2023-10-18 22:38:07,688 DEV : loss 0.17982900142669678 - f1-score (micro avg) 0.5598
+2023-10-18 22:38:07,703 saving best model
+2023-10-18 22:38:07,740 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:38:09,970 epoch 7 - iter 144/1445 - loss 0.12277903 - time (sec): 2.23 - samples/sec: 8485.81 - lr: 0.000022 - momentum: 0.000000
+2023-10-18 22:38:12,362 epoch 7 - iter 288/1445 - loss 0.12334114 - time (sec): 4.62 - samples/sec: 8187.85 - lr: 0.000021 - momentum: 0.000000
+2023-10-18 22:38:14,733 epoch 7 - iter 432/1445 - loss 0.12246111 - time (sec): 6.99 - samples/sec: 7768.69 - lr: 0.000021 - momentum: 0.000000
+2023-10-18 22:38:17,119 epoch 7 - iter 576/1445 - loss 0.11964029 - time (sec): 9.38 - samples/sec: 7702.34 - lr: 0.000020 - momentum: 0.000000
+2023-10-18 22:38:19,570 epoch 7 - iter 720/1445 - loss 0.12254055 - time (sec): 11.83 - samples/sec: 7662.96 - lr: 0.000019 - momentum: 0.000000
+2023-10-18 22:38:21,895 epoch 7 - iter 864/1445 - loss 0.12133982 - time (sec): 14.15 - samples/sec: 7530.25 - lr: 0.000019 - momentum: 0.000000
+2023-10-18 22:38:24,278 epoch 7 - iter 1008/1445 - loss 0.12022597 - time (sec): 16.54 - samples/sec: 7530.13 - lr: 0.000018 - momentum: 0.000000
+2023-10-18 22:38:26,625 epoch 7 - iter 1152/1445 - loss 0.11991213 - time (sec): 18.88 - samples/sec: 7461.88 - lr: 0.000018 - momentum: 0.000000
+2023-10-18 22:38:29,015 epoch 7 - iter 1296/1445 - loss 0.12152671 - time (sec): 21.27 - samples/sec: 7436.68 - lr: 0.000017 - momentum: 0.000000
+2023-10-18 22:38:31,393 epoch 7 - iter 1440/1445 - loss 0.11965156 - time (sec): 23.65 - samples/sec: 7414.95 - lr: 0.000017 - momentum: 0.000000
+2023-10-18 22:38:31,479 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:38:31,479 EPOCH 7 done: loss 0.1195 - lr: 0.000017
+2023-10-18 22:38:33,243 DEV : loss 0.17696666717529297 - f1-score (micro avg) 0.5872
+2023-10-18 22:38:33,257 saving best model
+2023-10-18 22:38:33,291 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:38:35,714 epoch 8 - iter 144/1445 - loss 0.11113135 - time (sec): 2.42 - samples/sec: 7759.48 - lr: 0.000016 - momentum: 0.000000
+2023-10-18 22:38:38,069 epoch 8 - iter 288/1445 - loss 0.11518653 - time (sec): 4.78 - samples/sec: 7472.68 - lr: 0.000016 - momentum: 0.000000
+2023-10-18 22:38:40,463 epoch 8 - iter 432/1445 - loss 0.11080588 - time (sec): 7.17 - samples/sec: 7551.22 - lr: 0.000015 - momentum: 0.000000
+2023-10-18 22:38:42,881 epoch 8 - iter 576/1445 - loss 0.11458603 - time (sec): 9.59 - samples/sec: 7566.20 - lr: 0.000014 - momentum: 0.000000
+2023-10-18 22:38:45,211 epoch 8 - iter 720/1445 - loss 0.11325887 - time (sec): 11.92 - samples/sec: 7521.84 - lr: 0.000014 - momentum: 0.000000
+2023-10-18 22:38:47,586 epoch 8 - iter 864/1445 - loss 0.11535133 - time (sec): 14.29 - samples/sec: 7499.72 - lr: 0.000013 - momentum: 0.000000
+2023-10-18 22:38:49,927 epoch 8 - iter 1008/1445 - loss 0.11466974 - time (sec): 16.64 - samples/sec: 7399.78 - lr: 0.000013 - momentum: 0.000000
+2023-10-18 22:38:52,351 epoch 8 - iter 1152/1445 - loss 0.11643710 - time (sec): 19.06 - samples/sec: 7432.46 - lr: 0.000012 - momentum: 0.000000
+2023-10-18 22:38:54,772 epoch 8 - iter 1296/1445 - loss 0.11547388 - time (sec): 21.48 - samples/sec: 7376.25 - lr: 0.000012 - momentum: 0.000000
+2023-10-18 22:38:57,156 epoch 8 - iter 1440/1445 - loss 0.11315324 - time (sec): 23.86 - samples/sec: 7365.81 - lr: 0.000011 - momentum: 0.000000
+2023-10-18 22:38:57,228 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:38:57,228 EPOCH 8 done: loss 0.1134 - lr: 0.000011
+2023-10-18 22:38:59,325 DEV : loss 0.18520388007164001 - f1-score (micro avg) 0.5793
+2023-10-18 22:38:59,341 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:39:01,877 epoch 9 - iter 144/1445 - loss 0.10887282 - time (sec): 2.54 - samples/sec: 7397.71 - lr: 0.000011 - momentum: 0.000000
+2023-10-18 22:39:04,328 epoch 9 - iter 288/1445 - loss 0.11482268 - time (sec): 4.99 - samples/sec: 7375.84 - lr: 0.000010 - momentum: 0.000000
+2023-10-18 22:39:06,789 epoch 9 - iter 432/1445 - loss 0.11164932 - time (sec): 7.45 - samples/sec: 7362.59 - lr: 0.000009 - momentum: 0.000000
+2023-10-18 22:39:09,250 epoch 9 - iter 576/1445 - loss 0.11268148 - time (sec): 9.91 - samples/sec: 7377.27 - lr: 0.000009 - momentum: 0.000000
+2023-10-18 22:39:11,620 epoch 9 - iter 720/1445 - loss 0.11314413 - time (sec): 12.28 - samples/sec: 7299.42 - lr: 0.000008 - momentum: 0.000000
+2023-10-18 22:39:14,036 epoch 9 - iter 864/1445 - loss 0.11255961 - time (sec): 14.69 - samples/sec: 7329.47 - lr: 0.000008 - momentum: 0.000000
+2023-10-18 22:39:16,437 epoch 9 - iter 1008/1445 - loss 0.11294305 - time (sec): 17.10 - samples/sec: 7310.68 - lr: 0.000007 - momentum: 0.000000
+2023-10-18 22:39:18,815 epoch 9 - iter 1152/1445 - loss 0.11188029 - time (sec): 19.47 - samples/sec: 7268.82 - lr: 0.000007 - momentum: 0.000000
+2023-10-18 22:39:21,293 epoch 9 - iter 1296/1445 - loss 0.10885370 - time (sec): 21.95 - samples/sec: 7270.80 - lr: 0.000006 - momentum: 0.000000
+2023-10-18 22:39:23,515 epoch 9 - iter 1440/1445 - loss 0.10971729 - time (sec): 24.17 - samples/sec: 7267.20 - lr: 0.000006 - momentum: 0.000000
+2023-10-18 22:39:23,585 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:39:23,586 EPOCH 9 done: loss 0.1097 - lr: 0.000006
+2023-10-18 22:39:25,383 DEV : loss 0.1853523552417755 - f1-score (micro avg) 0.5984
+2023-10-18 22:39:25,398 saving best model
+2023-10-18 22:39:25,433 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:39:27,799 epoch 10 - iter 144/1445 - loss 0.09962782 - time (sec): 2.37 - samples/sec: 7376.05 - lr: 0.000005 - momentum: 0.000000
+2023-10-18 22:39:30,273 epoch 10 - iter 288/1445 - loss 0.08889531 - time (sec): 4.84 - samples/sec: 7309.52 - lr: 0.000004 - momentum: 0.000000
+2023-10-18 22:39:32,650 epoch 10 - iter 432/1445 - loss 0.09811058 - time (sec): 7.22 - samples/sec: 7371.91 - lr: 0.000004 - momentum: 0.000000
+2023-10-18 22:39:35,027 epoch 10 - iter 576/1445 - loss 0.10124252 - time (sec): 9.59 - samples/sec: 7326.03 - lr: 0.000003 - momentum: 0.000000
+2023-10-18 22:39:37,386 epoch 10 - iter 720/1445 - loss 0.10043443 - time (sec): 11.95 - samples/sec: 7433.53 - lr: 0.000003 - momentum: 0.000000
+2023-10-18 22:39:39,866 epoch 10 - iter 864/1445 - loss 0.10093884 - time (sec): 14.43 - samples/sec: 7359.99 - lr: 0.000002 - momentum: 0.000000
+2023-10-18 22:39:42,282 epoch 10 - iter 1008/1445 - loss 0.10082663 - time (sec): 16.85 - samples/sec: 7272.67 - lr: 0.000002 - momentum: 0.000000
+2023-10-18 22:39:44,701 epoch 10 - iter 1152/1445 - loss 0.10177923 - time (sec): 19.27 - samples/sec: 7310.18 - lr: 0.000001 - momentum: 0.000000
+2023-10-18 22:39:47,102 epoch 10 - iter 1296/1445 - loss 0.10379308 - time (sec): 21.67 - samples/sec: 7271.08 - lr: 0.000001 - momentum: 0.000000
+2023-10-18 22:39:49,469 epoch 10 - iter 1440/1445 - loss 0.10584891 - time (sec): 24.04 - samples/sec: 7302.09 - lr: 0.000000 - momentum: 0.000000
+2023-10-18 22:39:49,547 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:39:49,547 EPOCH 10 done: loss 0.1055 - lr: 0.000000
+2023-10-18 22:39:51,338 DEV : loss 0.18715591728687286 - f1-score (micro avg) 0.5955
+2023-10-18 22:39:51,383 ----------------------------------------------------------------------------------------------------
+2023-10-18 22:39:51,383 Loading model from best epoch ...
+2023-10-18 22:39:51,466 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
+2023-10-18 22:39:52,786
+Results:
+- F-score (micro) 0.6129
+- F-score (macro) 0.4332
+- Accuracy 0.4549
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.6827    0.7140    0.6980       458
+         PER     0.6284    0.5332    0.5769       482
+         ORG     0.0833    0.0145    0.0247        69
+
+   micro avg     0.6500    0.5798    0.6129      1009
+   macro avg     0.4648    0.4206    0.4332      1009
+weighted avg     0.6157    0.5798    0.5941      1009
+
+2023-10-18 22:39:52,786 ----------------------------------------------------------------------------------------------------
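The three averaged rows in the final report can be re-derived from the per-class rows, which shows why the macro score (0.4332) sits so far below the micro score (0.6129). The sketch below uses the standard definitions of macro, weighted, and micro F1 on the rounded values printed in the log, so small rounding differences are possible; the variable names are hypothetical.

```python
# Per-class rows copied from the "By class" report above.
per_class = {
    "LOC": {"precision": 0.6827, "recall": 0.7140, "f1": 0.6980, "support": 458},
    "PER": {"precision": 0.6284, "recall": 0.5332, "f1": 0.5769, "support": 482},
    "ORG": {"precision": 0.0833, "recall": 0.0145, "f1": 0.0247, "support": 69},
}

total_support = sum(row["support"] for row in per_class.values())

# Macro average: every class counts equally, regardless of how many entities it has.
macro_f1 = sum(row["f1"] for row in per_class.values()) / len(per_class)

# Weighted average: each class counts in proportion to its support.
weighted_f1 = sum(row["f1"] * row["support"] for row in per_class.values()) / total_support

# Micro average: harmonic mean of the pooled precision and recall from the report.
micro_precision, micro_recall = 0.6500, 0.5798
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"support total: {total_support}")
print(f"macro F1:      {macro_f1:.4f}")
print(f"weighted F1:   {weighted_f1:.4f}")
print(f"micro F1:      {micro_f1:.4f}")
```

The ORG class has only 69 of the 1009 test entities and an F1 of 0.0247. It pulls the macro average down by a full third of the weight, while it barely affects the micro and weighted averages.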