stefan-it committed
Commit
f08c8dc
1 Parent(s): 77cfa42

Upload ./training.log with huggingface_hub

Files changed (1)
  1. training.log +245 -0
training.log ADDED
@@ -0,0 +1,245 @@
+ 2024-03-26 09:36:08,496 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:08,496 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(31103, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=17, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2024-03-26 09:36:08,496 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:08,496 Corpus: 758 train + 94 dev + 96 test sentences
+ 2024-03-26 09:36:08,496 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:08,497 Train: 758 sentences
+ 2024-03-26 09:36:08,497 (train_with_dev=False, train_with_test=False)
+ 2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:08,497 Training Params:
+ 2024-03-26 09:36:08,497 - learning_rate: "3e-05"
+ 2024-03-26 09:36:08,497 - mini_batch_size: "8"
+ 2024-03-26 09:36:08,497 - max_epochs: "10"
+ 2024-03-26 09:36:08,497 - shuffle: "True"
+ 2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:08,497 Plugins:
+ 2024-03-26 09:36:08,497 - TensorboardLogger
+ 2024-03-26 09:36:08,497 - LinearScheduler | warmup_fraction: '0.1'
+ 2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:08,497 Final evaluation on model from best epoch (best-model.pt)
+ 2024-03-26 09:36:08,497 - metric: "('micro avg', 'f1-score')"
+ 2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:08,497 Computation:
+ 2024-03-26 09:36:08,497 - compute on device: cuda:0
+ 2024-03-26 09:36:08,497 - embedding storage: none
+ 2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:08,497 Model training base path: "flair-co-funer-gbert_base-bs8-e10-lr3e-05-1"
+ 2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:08,497 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2024-03-26 09:36:10,077 epoch 1 - iter 9/95 - loss 3.07430171 - time (sec): 1.58 - samples/sec: 1948.72 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 09:36:11,598 epoch 1 - iter 18/95 - loss 2.93863503 - time (sec): 3.10 - samples/sec: 2015.76 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 09:36:13,982 epoch 1 - iter 27/95 - loss 2.72796059 - time (sec): 5.48 - samples/sec: 1867.09 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 09:36:16,197 epoch 1 - iter 36/95 - loss 2.54599224 - time (sec): 7.70 - samples/sec: 1815.54 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 09:36:18,079 epoch 1 - iter 45/95 - loss 2.40451416 - time (sec): 9.58 - samples/sec: 1822.54 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 09:36:19,302 epoch 1 - iter 54/95 - loss 2.28598600 - time (sec): 10.80 - samples/sec: 1863.97 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 09:36:21,006 epoch 1 - iter 63/95 - loss 2.17685156 - time (sec): 12.51 - samples/sec: 1859.95 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 09:36:22,291 epoch 1 - iter 72/95 - loss 2.07889595 - time (sec): 13.79 - samples/sec: 1888.50 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 09:36:24,262 epoch 1 - iter 81/95 - loss 1.95823440 - time (sec): 15.76 - samples/sec: 1878.77 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 09:36:25,582 epoch 1 - iter 90/95 - loss 1.86194154 - time (sec): 17.08 - samples/sec: 1898.79 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 09:36:26,799 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:26,799 EPOCH 1 done: loss 1.7902 - lr: 0.000028
+ 2024-03-26 09:36:27,629 DEV : loss 0.5363726615905762 - f1-score (micro avg) 0.6574
+ 2024-03-26 09:36:27,631 saving best model
+ 2024-03-26 09:36:27,890 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:29,937 epoch 2 - iter 9/95 - loss 0.60790463 - time (sec): 2.05 - samples/sec: 1804.37 - lr: 0.000030 - momentum: 0.000000
+ 2024-03-26 09:36:31,613 epoch 2 - iter 18/95 - loss 0.61325425 - time (sec): 3.72 - samples/sec: 1948.78 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 09:36:33,424 epoch 2 - iter 27/95 - loss 0.57820829 - time (sec): 5.53 - samples/sec: 1863.01 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 09:36:35,185 epoch 2 - iter 36/95 - loss 0.55677353 - time (sec): 7.29 - samples/sec: 1832.91 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 09:36:37,083 epoch 2 - iter 45/95 - loss 0.52219029 - time (sec): 9.19 - samples/sec: 1842.27 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 09:36:39,277 epoch 2 - iter 54/95 - loss 0.48907984 - time (sec): 11.39 - samples/sec: 1813.28 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 09:36:40,585 epoch 2 - iter 63/95 - loss 0.48219691 - time (sec): 12.69 - samples/sec: 1855.74 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 09:36:41,911 epoch 2 - iter 72/95 - loss 0.46533736 - time (sec): 14.02 - samples/sec: 1886.75 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 09:36:43,701 epoch 2 - iter 81/95 - loss 0.45301630 - time (sec): 15.81 - samples/sec: 1872.24 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 09:36:45,350 epoch 2 - iter 90/95 - loss 0.44355465 - time (sec): 17.46 - samples/sec: 1869.23 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 09:36:46,274 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:46,274 EPOCH 2 done: loss 0.4355 - lr: 0.000027
+ 2024-03-26 09:36:47,165 DEV : loss 0.2855934500694275 - f1-score (micro avg) 0.828
+ 2024-03-26 09:36:47,166 saving best model
+ 2024-03-26 09:36:47,592 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:36:49,533 epoch 3 - iter 9/95 - loss 0.34907051 - time (sec): 1.94 - samples/sec: 1730.91 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 09:36:51,457 epoch 3 - iter 18/95 - loss 0.29994757 - time (sec): 3.86 - samples/sec: 1741.95 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 09:36:52,799 epoch 3 - iter 27/95 - loss 0.27836668 - time (sec): 5.21 - samples/sec: 1837.71 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 09:36:55,264 epoch 3 - iter 36/95 - loss 0.27034037 - time (sec): 7.67 - samples/sec: 1762.67 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 09:36:57,482 epoch 3 - iter 45/95 - loss 0.25836891 - time (sec): 9.89 - samples/sec: 1795.48 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 09:36:58,646 epoch 3 - iter 54/95 - loss 0.25201034 - time (sec): 11.05 - samples/sec: 1853.93 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 09:37:00,551 epoch 3 - iter 63/95 - loss 0.24155981 - time (sec): 12.96 - samples/sec: 1838.34 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 09:37:02,161 epoch 3 - iter 72/95 - loss 0.23080357 - time (sec): 14.57 - samples/sec: 1843.73 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 09:37:03,897 epoch 3 - iter 81/95 - loss 0.23000286 - time (sec): 16.30 - samples/sec: 1834.77 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 09:37:06,051 epoch 3 - iter 90/95 - loss 0.22191158 - time (sec): 18.46 - samples/sec: 1804.80 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 09:37:06,522 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:37:06,522 EPOCH 3 done: loss 0.2220 - lr: 0.000024
+ 2024-03-26 09:37:07,412 DEV : loss 0.24312740564346313 - f1-score (micro avg) 0.8552
+ 2024-03-26 09:37:07,413 saving best model
+ 2024-03-26 09:37:07,838 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:37:09,430 epoch 4 - iter 9/95 - loss 0.21131525 - time (sec): 1.59 - samples/sec: 2025.19 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 09:37:11,440 epoch 4 - iter 18/95 - loss 0.17490390 - time (sec): 3.60 - samples/sec: 1791.42 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 09:37:13,208 epoch 4 - iter 27/95 - loss 0.17565118 - time (sec): 5.37 - samples/sec: 1814.83 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 09:37:15,745 epoch 4 - iter 36/95 - loss 0.15298936 - time (sec): 7.91 - samples/sec: 1742.92 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 09:37:17,412 epoch 4 - iter 45/95 - loss 0.15724109 - time (sec): 9.57 - samples/sec: 1763.81 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 09:37:18,943 epoch 4 - iter 54/95 - loss 0.15626722 - time (sec): 11.10 - samples/sec: 1816.57 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 09:37:20,780 epoch 4 - iter 63/95 - loss 0.15817304 - time (sec): 12.94 - samples/sec: 1839.64 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 09:37:22,037 epoch 4 - iter 72/95 - loss 0.15929185 - time (sec): 14.20 - samples/sec: 1871.53 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 09:37:23,744 epoch 4 - iter 81/95 - loss 0.15813202 - time (sec): 15.90 - samples/sec: 1860.66 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 09:37:25,226 epoch 4 - iter 90/95 - loss 0.15491116 - time (sec): 17.39 - samples/sec: 1881.62 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 09:37:26,124 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:37:26,124 EPOCH 4 done: loss 0.1537 - lr: 0.000020
+ 2024-03-26 09:37:27,018 DEV : loss 0.19133110344409943 - f1-score (micro avg) 0.8897
+ 2024-03-26 09:37:27,019 saving best model
+ 2024-03-26 09:37:27,449 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:37:29,172 epoch 5 - iter 9/95 - loss 0.10369406 - time (sec): 1.72 - samples/sec: 1839.54 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 09:37:31,302 epoch 5 - iter 18/95 - loss 0.11158756 - time (sec): 3.85 - samples/sec: 1740.67 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 09:37:32,861 epoch 5 - iter 27/95 - loss 0.10769290 - time (sec): 5.41 - samples/sec: 1792.98 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 09:37:34,526 epoch 5 - iter 36/95 - loss 0.10479897 - time (sec): 7.07 - samples/sec: 1783.06 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 09:37:36,200 epoch 5 - iter 45/95 - loss 0.11422202 - time (sec): 8.75 - samples/sec: 1833.65 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 09:37:37,790 epoch 5 - iter 54/95 - loss 0.11984042 - time (sec): 10.34 - samples/sec: 1881.06 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 09:37:39,623 epoch 5 - iter 63/95 - loss 0.11827238 - time (sec): 12.17 - samples/sec: 1861.37 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 09:37:41,837 epoch 5 - iter 72/95 - loss 0.10954916 - time (sec): 14.39 - samples/sec: 1886.33 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 09:37:43,075 epoch 5 - iter 81/95 - loss 0.11054361 - time (sec): 15.62 - samples/sec: 1906.03 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 09:37:45,210 epoch 5 - iter 90/95 - loss 0.10558766 - time (sec): 17.76 - samples/sec: 1864.66 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 09:37:45,834 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:37:45,834 EPOCH 5 done: loss 0.1061 - lr: 0.000017
+ 2024-03-26 09:37:46,727 DEV : loss 0.19185248017311096 - f1-score (micro avg) 0.8844
+ 2024-03-26 09:37:46,728 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:37:48,293 epoch 6 - iter 9/95 - loss 0.06238215 - time (sec): 1.56 - samples/sec: 1848.13 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 09:37:50,283 epoch 6 - iter 18/95 - loss 0.08071087 - time (sec): 3.55 - samples/sec: 1845.65 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 09:37:51,956 epoch 6 - iter 27/95 - loss 0.09114191 - time (sec): 5.23 - samples/sec: 1880.61 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 09:37:53,597 epoch 6 - iter 36/95 - loss 0.08852203 - time (sec): 6.87 - samples/sec: 1844.90 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 09:37:55,180 epoch 6 - iter 45/95 - loss 0.09173645 - time (sec): 8.45 - samples/sec: 1860.77 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 09:37:57,168 epoch 6 - iter 54/95 - loss 0.09174916 - time (sec): 10.44 - samples/sec: 1841.69 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 09:37:58,732 epoch 6 - iter 63/95 - loss 0.09285388 - time (sec): 12.00 - samples/sec: 1841.65 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 09:38:01,527 epoch 6 - iter 72/95 - loss 0.08563516 - time (sec): 14.80 - samples/sec: 1802.02 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 09:38:03,369 epoch 6 - iter 81/95 - loss 0.08246803 - time (sec): 16.64 - samples/sec: 1809.72 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 09:38:05,028 epoch 6 - iter 90/95 - loss 0.08342385 - time (sec): 18.30 - samples/sec: 1803.97 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 09:38:05,640 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:38:05,640 EPOCH 6 done: loss 0.0855 - lr: 0.000014
+ 2024-03-26 09:38:06,540 DEV : loss 0.18254657089710236 - f1-score (micro avg) 0.9094
+ 2024-03-26 09:38:06,541 saving best model
+ 2024-03-26 09:38:06,964 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:38:08,276 epoch 7 - iter 9/95 - loss 0.11117729 - time (sec): 1.31 - samples/sec: 2256.04 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 09:38:09,896 epoch 7 - iter 18/95 - loss 0.09218612 - time (sec): 2.93 - samples/sec: 2003.58 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 09:38:11,674 epoch 7 - iter 27/95 - loss 0.08775884 - time (sec): 4.71 - samples/sec: 1941.48 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 09:38:13,520 epoch 7 - iter 36/95 - loss 0.07798637 - time (sec): 6.55 - samples/sec: 1908.65 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 09:38:15,791 epoch 7 - iter 45/95 - loss 0.07180718 - time (sec): 8.83 - samples/sec: 1856.84 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 09:38:16,761 epoch 7 - iter 54/95 - loss 0.07085060 - time (sec): 9.80 - samples/sec: 1934.14 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 09:38:18,601 epoch 7 - iter 63/95 - loss 0.06618084 - time (sec): 11.64 - samples/sec: 1933.38 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 09:38:20,495 epoch 7 - iter 72/95 - loss 0.06265646 - time (sec): 13.53 - samples/sec: 1893.11 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 09:38:22,414 epoch 7 - iter 81/95 - loss 0.06409786 - time (sec): 15.45 - samples/sec: 1889.95 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 09:38:24,340 epoch 7 - iter 90/95 - loss 0.06399918 - time (sec): 17.37 - samples/sec: 1892.31 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 09:38:25,163 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:38:25,163 EPOCH 7 done: loss 0.0631 - lr: 0.000010
+ 2024-03-26 09:38:26,061 DEV : loss 0.18480655550956726 - f1-score (micro avg) 0.9115
+ 2024-03-26 09:38:26,062 saving best model
+ 2024-03-26 09:38:26,481 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:38:28,080 epoch 8 - iter 9/95 - loss 0.05686573 - time (sec): 1.60 - samples/sec: 1872.62 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 09:38:30,077 epoch 8 - iter 18/95 - loss 0.04837444 - time (sec): 3.59 - samples/sec: 1691.75 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 09:38:31,629 epoch 8 - iter 27/95 - loss 0.05493262 - time (sec): 5.15 - samples/sec: 1788.84 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 09:38:33,343 epoch 8 - iter 36/95 - loss 0.05918769 - time (sec): 6.86 - samples/sec: 1835.16 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 09:38:35,650 epoch 8 - iter 45/95 - loss 0.05018846 - time (sec): 9.17 - samples/sec: 1813.62 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 09:38:37,948 epoch 8 - iter 54/95 - loss 0.05231928 - time (sec): 11.47 - samples/sec: 1816.24 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 09:38:39,897 epoch 8 - iter 63/95 - loss 0.05413433 - time (sec): 13.41 - samples/sec: 1820.13 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 09:38:40,977 epoch 8 - iter 72/95 - loss 0.05401199 - time (sec): 14.49 - samples/sec: 1852.57 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 09:38:42,638 epoch 8 - iter 81/95 - loss 0.05202463 - time (sec): 16.15 - samples/sec: 1837.26 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 09:38:44,006 epoch 8 - iter 90/95 - loss 0.05195576 - time (sec): 17.52 - samples/sec: 1852.37 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 09:38:45,221 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:38:45,221 EPOCH 8 done: loss 0.0532 - lr: 0.000007
+ 2024-03-26 09:38:46,118 DEV : loss 0.1893010288476944 - f1-score (micro avg) 0.9151
+ 2024-03-26 09:38:46,119 saving best model
+ 2024-03-26 09:38:46,543 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:38:48,300 epoch 9 - iter 9/95 - loss 0.02851503 - time (sec): 1.75 - samples/sec: 1979.54 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 09:38:50,234 epoch 9 - iter 18/95 - loss 0.02569522 - time (sec): 3.69 - samples/sec: 1831.60 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 09:38:52,062 epoch 9 - iter 27/95 - loss 0.02821932 - time (sec): 5.52 - samples/sec: 1780.97 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 09:38:53,993 epoch 9 - iter 36/95 - loss 0.03965945 - time (sec): 7.45 - samples/sec: 1807.56 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 09:38:55,880 epoch 9 - iter 45/95 - loss 0.03698830 - time (sec): 9.33 - samples/sec: 1786.27 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 09:38:57,728 epoch 9 - iter 54/95 - loss 0.03810645 - time (sec): 11.18 - samples/sec: 1819.00 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 09:38:59,600 epoch 9 - iter 63/95 - loss 0.03945291 - time (sec): 13.06 - samples/sec: 1818.91 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 09:39:01,180 epoch 9 - iter 72/95 - loss 0.04280555 - time (sec): 14.63 - samples/sec: 1829.37 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 09:39:02,877 epoch 9 - iter 81/95 - loss 0.04475982 - time (sec): 16.33 - samples/sec: 1820.69 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 09:39:04,627 epoch 9 - iter 90/95 - loss 0.04242681 - time (sec): 18.08 - samples/sec: 1838.42 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 09:39:05,120 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:39:05,120 EPOCH 9 done: loss 0.0436 - lr: 0.000004
+ 2024-03-26 09:39:06,018 DEV : loss 0.18302294611930847 - f1-score (micro avg) 0.928
+ 2024-03-26 09:39:06,019 saving best model
+ 2024-03-26 09:39:06,442 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:39:07,911 epoch 10 - iter 9/95 - loss 0.01430248 - time (sec): 1.47 - samples/sec: 1892.14 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 09:39:09,716 epoch 10 - iter 18/95 - loss 0.02440600 - time (sec): 3.27 - samples/sec: 1847.50 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 09:39:11,840 epoch 10 - iter 27/95 - loss 0.03100064 - time (sec): 5.40 - samples/sec: 1791.55 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 09:39:13,686 epoch 10 - iter 36/95 - loss 0.03796915 - time (sec): 7.24 - samples/sec: 1810.96 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 09:39:14,849 epoch 10 - iter 45/95 - loss 0.03903990 - time (sec): 8.40 - samples/sec: 1864.85 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 09:39:16,746 epoch 10 - iter 54/95 - loss 0.04257694 - time (sec): 10.30 - samples/sec: 1848.08 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 09:39:18,118 epoch 10 - iter 63/95 - loss 0.04294533 - time (sec): 11.67 - samples/sec: 1861.40 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 09:39:20,341 epoch 10 - iter 72/95 - loss 0.03823652 - time (sec): 13.90 - samples/sec: 1843.12 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 09:39:22,628 epoch 10 - iter 81/95 - loss 0.04078223 - time (sec): 16.18 - samples/sec: 1824.51 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 09:39:24,462 epoch 10 - iter 90/95 - loss 0.03877567 - time (sec): 18.02 - samples/sec: 1816.73 - lr: 0.000000 - momentum: 0.000000
+ 2024-03-26 09:39:25,470 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:39:25,470 EPOCH 10 done: loss 0.0376 - lr: 0.000000
+ 2024-03-26 09:39:26,370 DEV : loss 0.1856098622083664 - f1-score (micro avg) 0.927
+ 2024-03-26 09:39:26,654 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 09:39:26,655 Loading model from best epoch ...
+ 2024-03-26 09:39:27,522 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
+ 2024-03-26 09:39:28,274
+ Results:
+ - F-score (micro) 0.9126
+ - F-score (macro) 0.6926
+ - Accuracy 0.8452
+
+ By class:
+               precision    recall  f1-score   support
+
+  Unternehmen     0.9331    0.8910    0.9115       266
+  Auslagerung     0.8626    0.9076    0.8845       249
+          Ort     0.9635    0.9851    0.9742       134
+     Software     0.0000    0.0000    0.0000         0
+
+    micro avg     0.9084    0.9168    0.9126       649
+    macro avg     0.6898    0.6959    0.6926       649
+ weighted avg     0.9123    0.9168    0.9141       649
+
+ 2024-03-26 09:39:28,274 ----------------------------------------------------------------------------------------------------
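Two sets of numbers in this log can be re-derived from values it prints: the learning-rate column and the averaged F-scores. The sketch below is not Flair's code. It assumes the LinearScheduler plugin warms up linearly over the first 10% of all optimizer steps and then decays linearly to zero, and that the macro and weighted averages are plain and support-weighted means over the four rows of the "By class" table. All function and constant names are hypothetical; the numeric inputs are copied from the log.

```python
import math

# Values copied from the log header.
TRAIN_SENTENCES = 758
MINI_BATCH_SIZE = 8
MAX_EPOCHS = 10
PEAK_LR = 3e-05
WARMUP_FRACTION = 0.1

# 758 sentences in batches of 8 gives the 95 iterations per epoch seen in the log.
STEPS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)
TOTAL_STEPS = STEPS_PER_EPOCH * MAX_EPOCHS


def linear_warmup_lr(step):
    """Assumed schedule: linear warmup to PEAK_LR, then linear decay to 0."""
    warmup_steps = round(TOTAL_STEPS * WARMUP_FRACTION)
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - warmup_steps)


# (precision, recall, f1, support) per class, copied from the "By class" table.
BY_CLASS = {
    "Unternehmen": (0.9331, 0.8910, 0.9115, 266),
    "Auslagerung": (0.8626, 0.9076, 0.8845, 249),
    "Ort": (0.9635, 0.9851, 0.9742, 134),
    "Software": (0.0000, 0.0000, 0.0000, 0),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_f1(rows):
    """Unweighted mean of per-class F1, including zero-support classes."""
    return sum(row[2] for row in rows.values()) / len(rows)


def weighted_f1(rows):
    """Mean of per-class F1 weighted by each class's support."""
    support = sum(row[3] for row in rows.values())
    return sum(row[2] * row[3] for row in rows.values()) / support


if __name__ == "__main__":
    for epoch, it in [(1, 9), (2, 9), (5, 18), (10, 90)]:
        step = (epoch - 1) * STEPS_PER_EPOCH + it
        print(f"epoch {epoch} iter {it}: lr {linear_warmup_lr(step):.6f}")
    print(f"micro F1 from P/R: {f1(0.9084, 0.9168):.4f}")
    print(f"macro F1:          {macro_f1(BY_CLASS):.4f}")
    print(f"weighted F1:       {weighted_f1(BY_CLASS):.4f}")
```

Under these assumptions the recomputed values agree with the logged ones to the printed precision. The Software row has support 0 in the test split, so its all-zero scores count as a fourth class in the macro average; that explains why the macro F-score (0.6926) sits far below the micro F-score (0.9126). Averaging only the three classes that occur in the test split would give roughly 0.9234.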