2024-03-26 09:36:08,496 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,496 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-03-26 09:36:08,496 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,496 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 09:36:08,496 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Train: 758 sentences
2024-03-26 09:36:08,497 (train_with_dev=False, train_with_test=False)
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Training Params:
2024-03-26 09:36:08,497 - learning_rate: "3e-05"
2024-03-26 09:36:08,497 - mini_batch_size: "8"
2024-03-26 09:36:08,497 - max_epochs: "10"
2024-03-26 09:36:08,497 - shuffle: "True"
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Plugins:
2024-03-26 09:36:08,497 - TensorboardLogger
2024-03-26 09:36:08,497 - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 09:36:08,497 - metric: "('micro avg', 'f1-score')"
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Computation:
2024-03-26 09:36:08,497 - compute on device: cuda:0
2024-03-26 09:36:08,497 - embedding storage: none
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Model training base path: "flair-co-funer-gbert_base-bs8-e10-lr3e-05-1"
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:08,497 Logging anything other than scalars to TensorBoard is currently not supported.
2024-03-26 09:36:10,077 epoch 1 - iter 9/95 - loss 3.07430171 - time (sec): 1.58 - samples/sec: 1948.72 - lr: 0.000003 - momentum: 0.000000
2024-03-26 09:36:11,598 epoch 1 - iter 18/95 - loss 2.93863503 - time (sec): 3.10 - samples/sec: 2015.76 - lr: 0.000005 - momentum: 0.000000
2024-03-26 09:36:13,982 epoch 1 - iter 27/95 - loss 2.72796059 - time (sec): 5.48 - samples/sec: 1867.09 - lr: 0.000008 - momentum: 0.000000
2024-03-26 09:36:16,197 epoch 1 - iter 36/95 - loss 2.54599224 - time (sec): 7.70 - samples/sec: 1815.54 - lr: 0.000011 - momentum: 0.000000
2024-03-26 09:36:18,079 epoch 1 - iter 45/95 - loss 2.40451416 - time (sec): 9.58 - samples/sec: 1822.54 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:36:19,302 epoch 1 - iter 54/95 - loss 2.28598600 - time (sec): 10.80 - samples/sec: 1863.97 - lr: 0.000017 - momentum: 0.000000
2024-03-26 09:36:21,006 epoch 1 - iter 63/95 - loss 2.17685156 - time (sec): 12.51 - samples/sec: 1859.95 - lr: 0.000020 - momentum: 0.000000
2024-03-26 09:36:22,291 epoch 1 - iter 72/95 - loss 2.07889595 - time (sec): 13.79 - samples/sec: 1888.50 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:36:24,262 epoch 1 - iter 81/95 - loss 1.95823440 - time (sec): 15.76 - samples/sec: 1878.77 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:36:25,582 epoch 1 - iter 90/95 - loss 1.86194154 - time (sec): 17.08 - samples/sec: 1898.79 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:36:26,799 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:26,799 EPOCH 1 done: loss 1.7902 - lr: 0.000028
2024-03-26 09:36:27,629 DEV : loss 0.5363726615905762 - f1-score (micro avg) 0.6574
2024-03-26 09:36:27,631 saving best model
2024-03-26 09:36:27,890 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:29,937 epoch 2 - iter 9/95 - loss 0.60790463 - time (sec): 2.05 - samples/sec: 1804.37 - lr: 0.000030 - momentum: 0.000000
2024-03-26 09:36:31,613 epoch 2 - iter 18/95 - loss 0.61325425 - time (sec): 3.72 - samples/sec: 1948.78 - lr: 0.000029 - momentum: 0.000000
2024-03-26 09:36:33,424 epoch 2 - iter 27/95 - loss 0.57820829 - time (sec): 5.53 - samples/sec: 1863.01 - lr: 0.000029 - momentum: 0.000000
2024-03-26 09:36:35,185 epoch 2 - iter 36/95 - loss 0.55677353 - time (sec): 7.29 - samples/sec: 1832.91 - lr: 0.000029 - momentum: 0.000000
2024-03-26 09:36:37,083 epoch 2 - iter 45/95 - loss 0.52219029 - time (sec): 9.19 - samples/sec: 1842.27 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:36:39,277 epoch 2 - iter 54/95 - loss 0.48907984 - time (sec): 11.39 - samples/sec: 1813.28 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:36:40,585 epoch 2 - iter 63/95 - loss 0.48219691 - time (sec): 12.69 - samples/sec: 1855.74 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:36:41,911 epoch 2 - iter 72/95 - loss 0.46533736 - time (sec): 14.02 - samples/sec: 1886.75 - lr: 0.000028 - momentum: 0.000000
2024-03-26 09:36:43,701 epoch 2 - iter 81/95 - loss 0.45301630 - time (sec): 15.81 - samples/sec: 1872.24 - lr: 0.000027 - momentum: 0.000000
2024-03-26 09:36:45,350 epoch 2 - iter 90/95 - loss 0.44355465 - time (sec): 17.46 - samples/sec: 1869.23 - lr: 0.000027 - momentum: 0.000000
2024-03-26 09:36:46,274 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:46,274 EPOCH 2 done: loss 0.4355 - lr: 0.000027
2024-03-26 09:36:47,165 DEV : loss 0.2855934500694275 - f1-score (micro avg) 0.828
2024-03-26 09:36:47,166 saving best model
2024-03-26 09:36:47,592 ----------------------------------------------------------------------------------------------------
2024-03-26 09:36:49,533 epoch 3 - iter 9/95 - loss 0.34907051 - time (sec): 1.94 - samples/sec: 1730.91 - lr: 0.000026 - momentum: 0.000000
2024-03-26 09:36:51,457 epoch 3 - iter 18/95 - loss 0.29994757 - time (sec): 3.86 - samples/sec: 1741.95 - lr: 0.000026 - momentum: 0.000000
2024-03-26 09:36:52,799 epoch 3 - iter 27/95 - loss 0.27836668 - time (sec): 5.21 - samples/sec: 1837.71 - lr: 0.000026 - momentum: 0.000000
2024-03-26 09:36:55,264 epoch 3 - iter 36/95 - loss 0.27034037 - time (sec): 7.67 - samples/sec: 1762.67 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:36:57,482 epoch 3 - iter 45/95 - loss 0.25836891 - time (sec): 9.89 - samples/sec: 1795.48 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:36:58,646 epoch 3 - iter 54/95 - loss 0.25201034 - time (sec): 11.05 - samples/sec: 1853.93 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:37:00,551 epoch 3 - iter 63/95 - loss 0.24155981 - time (sec): 12.96 - samples/sec: 1838.34 - lr: 0.000025 - momentum: 0.000000
2024-03-26 09:37:02,161 epoch 3 - iter 72/95 - loss 0.23080357 - time (sec): 14.57 - samples/sec: 1843.73 - lr: 0.000024 - momentum: 0.000000
2024-03-26 09:37:03,897 epoch 3 - iter 81/95 - loss 0.23000286 - time (sec): 16.30 - samples/sec: 1834.77 - lr: 0.000024 - momentum: 0.000000
2024-03-26 09:37:06,051 epoch 3 - iter 90/95 - loss 0.22191158 - time (sec): 18.46 - samples/sec: 1804.80 - lr: 0.000024 - momentum: 0.000000
2024-03-26 09:37:06,522 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:06,522 EPOCH 3 done: loss 0.2220 - lr: 0.000024
2024-03-26 09:37:07,412 DEV : loss 0.24312740564346313 - f1-score (micro avg) 0.8552
2024-03-26 09:37:07,413 saving best model
2024-03-26 09:37:07,838 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:09,430 epoch 4 - iter 9/95 - loss 0.21131525 - time (sec): 1.59 - samples/sec: 2025.19 - lr: 0.000023 - momentum: 0.000000
2024-03-26 09:37:11,440 epoch 4 - iter 18/95 - loss 0.17490390 - time (sec): 3.60 - samples/sec: 1791.42 - lr: 0.000023 - momentum: 0.000000
2024-03-26 09:37:13,208 epoch 4 - iter 27/95 - loss 0.17565118 - time (sec): 5.37 - samples/sec: 1814.83 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:37:15,745 epoch 4 - iter 36/95 - loss 0.15298936 - time (sec): 7.91 - samples/sec: 1742.92 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:37:17,412 epoch 4 - iter 45/95 - loss 0.15724109 - time (sec): 9.57 - samples/sec: 1763.81 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:37:18,943 epoch 4 - iter 54/95 - loss 0.15626722 - time (sec): 11.10 - samples/sec: 1816.57 - lr: 0.000022 - momentum: 0.000000
2024-03-26 09:37:20,780 epoch 4 - iter 63/95 - loss 0.15817304 - time (sec): 12.94 - samples/sec: 1839.64 - lr: 0.000021 - momentum: 0.000000
2024-03-26 09:37:22,037 epoch 4 - iter 72/95 - loss 0.15929185 - time (sec): 14.20 - samples/sec: 1871.53 - lr: 0.000021 - momentum: 0.000000
2024-03-26 09:37:23,744 epoch 4 - iter 81/95 - loss 0.15813202 - time (sec): 15.90 - samples/sec: 1860.66 - lr: 0.000021 - momentum: 0.000000
2024-03-26 09:37:25,226 epoch 4 - iter 90/95 - loss 0.15491116 - time (sec): 17.39 - samples/sec: 1881.62 - lr: 0.000020 - momentum: 0.000000
2024-03-26 09:37:26,124 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:26,124 EPOCH 4 done: loss 0.1537 - lr: 0.000020
2024-03-26 09:37:27,018 DEV : loss 0.19133110344409943 - f1-score (micro avg) 0.8897
2024-03-26 09:37:27,019 saving best model
2024-03-26 09:37:27,449 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:29,172 epoch 5 - iter 9/95 - loss 0.10369406 - time (sec): 1.72 - samples/sec: 1839.54 - lr: 0.000020 - momentum: 0.000000
2024-03-26 09:37:31,302 epoch 5 - iter 18/95 - loss 0.11158756 - time (sec): 3.85 - samples/sec: 1740.67 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:37:32,861 epoch 5 - iter 27/95 - loss 0.10769290 - time (sec): 5.41 - samples/sec: 1792.98 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:37:34,526 epoch 5 - iter 36/95 - loss 0.10479897 - time (sec): 7.07 - samples/sec: 1783.06 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:37:36,200 epoch 5 - iter 45/95 - loss 0.11422202 - time (sec): 8.75 - samples/sec: 1833.65 - lr: 0.000019 - momentum: 0.000000
2024-03-26 09:37:37,790 epoch 5 - iter 54/95 - loss 0.11984042 - time (sec): 10.34 - samples/sec: 1881.06 - lr: 0.000018 - momentum: 0.000000
2024-03-26 09:37:39,623 epoch 5 - iter 63/95 - loss 0.11827238 - time (sec): 12.17 - samples/sec: 1861.37 - lr: 0.000018 - momentum: 0.000000
2024-03-26 09:37:41,837 epoch 5 - iter 72/95 - loss 0.10954916 - time (sec): 14.39 - samples/sec: 1886.33 - lr: 0.000018 - momentum: 0.000000
2024-03-26 09:37:43,075 epoch 5 - iter 81/95 - loss 0.11054361 - time (sec): 15.62 - samples/sec: 1906.03 - lr: 0.000017 - momentum: 0.000000
2024-03-26 09:37:45,210 epoch 5 - iter 90/95 - loss 0.10558766 - time (sec): 17.76 - samples/sec: 1864.66 - lr: 0.000017 - momentum: 0.000000
2024-03-26 09:37:45,834 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:45,834 EPOCH 5 done: loss 0.1061 - lr: 0.000017
2024-03-26 09:37:46,727 DEV : loss 0.19185248017311096 - f1-score (micro avg) 0.8844
2024-03-26 09:37:46,728 ----------------------------------------------------------------------------------------------------
2024-03-26 09:37:48,293 epoch 6 - iter 9/95 - loss 0.06238215 - time (sec): 1.56 - samples/sec: 1848.13 - lr: 0.000016 - momentum: 0.000000
2024-03-26 09:37:50,283 epoch 6 - iter 18/95 - loss 0.08071087 - time (sec): 3.55 - samples/sec: 1845.65 - lr: 0.000016 - momentum: 0.000000
2024-03-26 09:37:51,956 epoch 6 - iter 27/95 - loss 0.09114191 - time (sec): 5.23 - samples/sec: 1880.61 - lr: 0.000016 - momentum: 0.000000
2024-03-26 09:37:53,597 epoch 6 - iter 36/95 - loss 0.08852203 - time (sec): 6.87 - samples/sec: 1844.90 - lr: 0.000016 - momentum: 0.000000
2024-03-26 09:37:55,180 epoch 6 - iter 45/95 - loss 0.09173645 - time (sec): 8.45 - samples/sec: 1860.77 - lr: 0.000015 - momentum: 0.000000
2024-03-26 09:37:57,168 epoch 6 - iter 54/95 - loss 0.09174916 - time (sec): 10.44 - samples/sec: 1841.69 - lr: 0.000015 - momentum: 0.000000
2024-03-26 09:37:58,732 epoch 6 - iter 63/95 - loss 0.09285388 - time (sec): 12.00 - samples/sec: 1841.65 - lr: 0.000015 - momentum: 0.000000
2024-03-26 09:38:01,527 epoch 6 - iter 72/95 - loss 0.08563516 - time (sec): 14.80 - samples/sec: 1802.02 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:38:03,369 epoch 6 - iter 81/95 - loss 0.08246803 - time (sec): 16.64 - samples/sec: 1809.72 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:38:05,028 epoch 6 - iter 90/95 - loss 0.08342385 - time (sec): 18.30 - samples/sec: 1803.97 - lr: 0.000014 - momentum: 0.000000
2024-03-26 09:38:05,640 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:05,640 EPOCH 6 done: loss 0.0855 - lr: 0.000014
2024-03-26 09:38:06,540 DEV : loss 0.18254657089710236 - f1-score (micro avg) 0.9094
2024-03-26 09:38:06,541 saving best model
2024-03-26 09:38:06,964 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:08,276 epoch 7 - iter 9/95 - loss 0.11117729 - time (sec): 1.31 - samples/sec: 2256.04 - lr: 0.000013 - momentum: 0.000000
2024-03-26 09:38:09,896 epoch 7 - iter 18/95 - loss 0.09218612 - time (sec): 2.93 - samples/sec: 2003.58 - lr: 0.000013 - momentum: 0.000000
2024-03-26 09:38:11,674 epoch 7 - iter 27/95 - loss 0.08775884 - time (sec): 4.71 - samples/sec: 1941.48 - lr: 0.000013 - momentum: 0.000000
2024-03-26 09:38:13,520 epoch 7 - iter 36/95 - loss 0.07798637 - time (sec): 6.55 - samples/sec: 1908.65 - lr: 0.000012 - momentum: 0.000000
2024-03-26 09:38:15,791 epoch 7 - iter 45/95 - loss 0.07180718 - time (sec): 8.83 - samples/sec: 1856.84 - lr: 0.000012 - momentum: 0.000000
2024-03-26 09:38:16,761 epoch 7 - iter 54/95 - loss 0.07085060 - time (sec): 9.80 - samples/sec: 1934.14 - lr: 0.000012 - momentum: 0.000000
2024-03-26 09:38:18,601 epoch 7 - iter 63/95 - loss 0.06618084 - time (sec): 11.64 - samples/sec: 1933.38 - lr: 0.000011 - momentum: 0.000000
2024-03-26 09:38:20,495 epoch 7 - iter 72/95 - loss 0.06265646 - time (sec): 13.53 - samples/sec: 1893.11 - lr: 0.000011 - momentum: 0.000000
2024-03-26 09:38:22,414 epoch 7 - iter 81/95 - loss 0.06409786 - time (sec): 15.45 - samples/sec: 1889.95 - lr: 0.000011 - momentum: 0.000000
2024-03-26 09:38:24,340 epoch 7 - iter 90/95 - loss 0.06399918 - time (sec): 17.37 - samples/sec: 1892.31 - lr: 0.000010 - momentum: 0.000000
2024-03-26 09:38:25,163 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:25,163 EPOCH 7 done: loss 0.0631 - lr: 0.000010
2024-03-26 09:38:26,061 DEV : loss 0.18480655550956726 - f1-score (micro avg) 0.9115
2024-03-26 09:38:26,062 saving best model
2024-03-26 09:38:26,481 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:28,080 epoch 8 - iter 9/95 - loss 0.05686573 - time (sec): 1.60 - samples/sec: 1872.62 - lr: 0.000010 - momentum: 0.000000
2024-03-26 09:38:30,077 epoch 8 - iter 18/95 - loss 0.04837444 - time (sec): 3.59 - samples/sec: 1691.75 - lr: 0.000010 - momentum: 0.000000
2024-03-26 09:38:31,629 epoch 8 - iter 27/95 - loss 0.05493262 - time (sec): 5.15 - samples/sec: 1788.84 - lr: 0.000009 - momentum: 0.000000
2024-03-26 09:38:33,343 epoch 8 - iter 36/95 - loss 0.05918769 - time (sec): 6.86 - samples/sec: 1835.16 - lr: 0.000009 - momentum: 0.000000
2024-03-26 09:38:35,650 epoch 8 - iter 45/95 - loss 0.05018846 - time (sec): 9.17 - samples/sec: 1813.62 - lr: 0.000009 - momentum: 0.000000
2024-03-26 09:38:37,948 epoch 8 - iter 54/95 - loss 0.05231928 - time (sec): 11.47 - samples/sec: 1816.24 - lr: 0.000008 - momentum: 0.000000
2024-03-26 09:38:39,897 epoch 8 - iter 63/95 - loss 0.05413433 - time (sec): 13.41 - samples/sec: 1820.13 - lr: 0.000008 - momentum: 0.000000
2024-03-26 09:38:40,977 epoch 8 - iter 72/95 - loss 0.05401199 - time (sec): 14.49 - samples/sec: 1852.57 - lr: 0.000008 - momentum: 0.000000
2024-03-26 09:38:42,638 epoch 8 - iter 81/95 - loss 0.05202463 - time (sec): 16.15 - samples/sec: 1837.26 - lr: 0.000007 - momentum: 0.000000
2024-03-26 09:38:44,006 epoch 8 - iter 90/95 - loss 0.05195576 - time (sec): 17.52 - samples/sec: 1852.37 - lr: 0.000007 - momentum: 0.000000
2024-03-26 09:38:45,221 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:45,221 EPOCH 8 done: loss 0.0532 - lr: 0.000007
2024-03-26 09:38:46,118 DEV : loss 0.1893010288476944 - f1-score (micro avg) 0.9151
2024-03-26 09:38:46,119 saving best model
2024-03-26 09:38:46,543 ----------------------------------------------------------------------------------------------------
2024-03-26 09:38:48,300 epoch 9 - iter 9/95 - loss 0.02851503 - time (sec): 1.75 - samples/sec: 1979.54 - lr: 0.000007 - momentum: 0.000000
2024-03-26 09:38:50,234 epoch 9 - iter 18/95 - loss 0.02569522 - time (sec): 3.69 - samples/sec: 1831.60 - lr: 0.000006 - momentum: 0.000000
2024-03-26 09:38:52,062 epoch 9 - iter 27/95 - loss 0.02821932 - time (sec): 5.52 - samples/sec: 1780.97 - lr: 0.000006 - momentum: 0.000000
2024-03-26 09:38:53,993 epoch 9 - iter 36/95 - loss 0.03965945 - time (sec): 7.45 - samples/sec: 1807.56 - lr: 0.000006 - momentum: 0.000000
2024-03-26 09:38:55,880 epoch 9 - iter 45/95 - loss 0.03698830 - time (sec): 9.33 - samples/sec: 1786.27 - lr: 0.000005 - momentum: 0.000000
2024-03-26 09:38:57,728 epoch 9 - iter 54/95 - loss 0.03810645 - time (sec): 11.18 - samples/sec: 1819.00 - lr: 0.000005 - momentum: 0.000000
2024-03-26 09:38:59,600 epoch 9 - iter 63/95 - loss 0.03945291 - time (sec): 13.06 - samples/sec: 1818.91 - lr: 0.000005 - momentum: 0.000000
2024-03-26 09:39:01,180 epoch 9 - iter 72/95 - loss 0.04280555 - time (sec): 14.63 - samples/sec: 1829.37 - lr: 0.000004 - momentum: 0.000000
2024-03-26 09:39:02,877 epoch 9 - iter 81/95 - loss 0.04475982 - time (sec): 16.33 - samples/sec: 1820.69 - lr: 0.000004 - momentum: 0.000000
2024-03-26 09:39:04,627 epoch 9 - iter 90/95 - loss 0.04242681 - time (sec): 18.08 - samples/sec: 1838.42 - lr: 0.000004 - momentum: 0.000000
2024-03-26 09:39:05,120 ----------------------------------------------------------------------------------------------------
2024-03-26 09:39:05,120 EPOCH 9 done: loss 0.0436 - lr: 0.000004
2024-03-26 09:39:06,018 DEV : loss 0.18302294611930847 - f1-score (micro avg) 0.928
2024-03-26 09:39:06,019 saving best model
2024-03-26 09:39:06,442 ----------------------------------------------------------------------------------------------------
2024-03-26 09:39:07,911 epoch 10 - iter 9/95 - loss 0.01430248 - time (sec): 1.47 - samples/sec: 1892.14 - lr: 0.000003 - momentum: 0.000000
2024-03-26 09:39:09,716 epoch 10 - iter 18/95 - loss 0.02440600 - time (sec): 3.27 - samples/sec: 1847.50 - lr: 0.000003 - momentum: 0.000000
2024-03-26 09:39:11,840 epoch 10 - iter 27/95 - loss 0.03100064 - time (sec): 5.40 - samples/sec: 1791.55 - lr: 0.000003 - momentum: 0.000000
2024-03-26 09:39:13,686 epoch 10 - iter 36/95 - loss 0.03796915 - time (sec): 7.24 - samples/sec: 1810.96 - lr: 0.000002 - momentum: 0.000000
2024-03-26 09:39:14,849 epoch 10 - iter 45/95 - loss 0.03903990 - time (sec): 8.40 - samples/sec: 1864.85 - lr: 0.000002 - momentum: 0.000000
2024-03-26 09:39:16,746 epoch 10 - iter 54/95 - loss 0.04257694 - time (sec): 10.30 - samples/sec: 1848.08 - lr: 0.000002 - momentum: 0.000000
2024-03-26 09:39:18,118 epoch 10 - iter 63/95 - loss 0.04294533 - time (sec): 11.67 - samples/sec: 1861.40 - lr: 0.000001 - momentum: 0.000000
2024-03-26 09:39:20,341 epoch 10 - iter 72/95 - loss 0.03823652 - time (sec): 13.90 - samples/sec: 1843.12 - lr: 0.000001 - momentum: 0.000000
2024-03-26 09:39:22,628 epoch 10 - iter 81/95 - loss 0.04078223 - time (sec): 16.18 - samples/sec: 1824.51 - lr: 0.000001 - momentum: 0.000000
2024-03-26 09:39:24,462 epoch 10 - iter 90/95 - loss 0.03877567 - time (sec): 18.02 - samples/sec: 1816.73 - lr: 0.000000 - momentum: 0.000000
2024-03-26 09:39:25,470 ----------------------------------------------------------------------------------------------------
2024-03-26 09:39:25,470 EPOCH 10 done: loss 0.0376 - lr: 0.000000
2024-03-26 09:39:26,370 DEV : loss 0.1856098622083664 - f1-score (micro avg) 0.927
2024-03-26 09:39:26,654 ----------------------------------------------------------------------------------------------------
2024-03-26 09:39:26,655 Loading model from best epoch ...
2024-03-26 09:39:27,522 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 09:39:28,274
Results:
- F-score (micro) 0.9126
- F-score (macro) 0.6926
- Accuracy 0.8452

By class:
              precision    recall  f1-score   support

 Unternehmen     0.9331    0.8910    0.9115       266
 Auslagerung     0.8626    0.9076    0.8845       249
         Ort     0.9635    0.9851    0.9742       134
    Software     0.0000    0.0000    0.0000         0

   micro avg     0.9084    0.9168    0.9126       649
   macro avg     0.6898    0.6959    0.6926       649
weighted avg     0.9123    0.9168    0.9141       649

2024-03-26 09:39:28,274 ----------------------------------------------------------------------------------------------------