Upload ./training.log with huggingface_hub
training.log (ADDED, +261 -0)
@@ -0,0 +1,261 @@
2023-10-18 23:49:39,844 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,845 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=81, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 23:49:39,845 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,845 Corpus: 6900 train + 1576 dev + 1833 test sentences
2023-10-18 23:49:39,846 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,846 Train: 6900 sentences
2023-10-18 23:49:39,846 (train_with_dev=False, train_with_test=False)
2023-10-18 23:49:39,846 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,846 Training Params:
2023-10-18 23:49:39,846 - learning_rate: "5e-05"
2023-10-18 23:49:39,846 - mini_batch_size: "16"
2023-10-18 23:49:39,846 - max_epochs: "10"
2023-10-18 23:49:39,846 - shuffle: "True"
2023-10-18 23:49:39,846 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,846 Plugins:
2023-10-18 23:49:39,846 - TensorboardLogger
2023-10-18 23:49:39,846 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 23:49:39,846 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,846 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 23:49:39,846 - metric: "('micro avg', 'f1-score')"
2023-10-18 23:49:39,846 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,846 Computation:
2023-10-18 23:49:39,846 - compute on device: cuda:0
2023-10-18 23:49:39,847 - embedding storage: none
2023-10-18 23:49:39,847 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,847 Model training base path: "autotrain-flair-mobie-gbert_base-bs16-e10-lr5e-05-1"
2023-10-18 23:49:39,847 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,847 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,847 Logging anything other than scalars to TensorBoard is currently not supported.
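The training parameters and the `LinearScheduler | warmup_fraction: '0.1'` plugin listed above are enough to approximate the `lr` column of the per-iteration lines. The following is a minimal sketch, not Flair's actual scheduler code: it assumes a linear warmup from 0 to the peak learning rate over the first 10% of all optimizer steps, followed by a linear decay to 0, and that one epoch has `ceil(6900 / 16) = 432` steps (which matches the `iter .../432` counters in the log). The function name `linear_schedule_lr` is hypothetical.

```python
import math


def linear_schedule_lr(step, peak_lr, total_steps, warmup_fraction):
    """Learning rate after `step` optimizer steps.

    Assumed shape: linear warmup to `peak_lr`, then linear decay to zero.
    This is an illustration, not the scheduler implementation used by Flair.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0, total_steps - step) / (total_steps - warmup_steps)


# Values taken from the log: 6900 training sentences, batch size 16, 10 epochs.
steps_per_epoch = math.ceil(6900 / 16)
total_steps = steps_per_epoch * 10


def lr_at(epoch, iteration):
    """Format the assumed learning rate the way the log prints it."""
    step = (epoch - 1) * steps_per_epoch + iteration
    return f"{linear_schedule_lr(step, 5e-5, total_steps, 0.1):.6f}"


if __name__ == "__main__":
    for epoch, iteration in [(1, 43), (1, 430), (2, 430), (5, 430), (10, 430)]:
        print(f"epoch {epoch} - iter {iteration}/{steps_per_epoch} - lr: {lr_at(epoch, iteration)}")
```

Under these assumptions the sketch gives the same six-decimal values as the log at the checked points (for example `0.000005` at epoch 1, iter 43 and `0.000044` at epoch 2, iter 430), so the warmup appears to span exactly the first epoch.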
2023-10-18 23:49:54,007 epoch 1 - iter 43/432 - loss 4.54226396 - time (sec): 14.16 - samples/sec: 427.65 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:50:08,103 epoch 1 - iter 86/432 - loss 3.39025495 - time (sec): 28.26 - samples/sec: 429.20 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:50:22,380 epoch 1 - iter 129/432 - loss 2.82298352 - time (sec): 42.53 - samples/sec: 431.35 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:50:35,740 epoch 1 - iter 172/432 - loss 2.47385817 - time (sec): 55.89 - samples/sec: 437.97 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:50:49,999 epoch 1 - iter 215/432 - loss 2.21787123 - time (sec): 70.15 - samples/sec: 433.61 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:51:03,183 epoch 1 - iter 258/432 - loss 2.00329484 - time (sec): 83.33 - samples/sec: 440.26 - lr: 0.000030 - momentum: 0.000000
2023-10-18 23:51:16,381 epoch 1 - iter 301/432 - loss 1.82489643 - time (sec): 96.53 - samples/sec: 447.14 - lr: 0.000035 - momentum: 0.000000
2023-10-18 23:51:30,192 epoch 1 - iter 344/432 - loss 1.69046523 - time (sec): 110.34 - samples/sec: 445.93 - lr: 0.000040 - momentum: 0.000000
2023-10-18 23:51:43,742 epoch 1 - iter 387/432 - loss 1.57702628 - time (sec): 123.89 - samples/sec: 444.99 - lr: 0.000045 - momentum: 0.000000
2023-10-18 23:51:58,002 epoch 1 - iter 430/432 - loss 1.47019726 - time (sec): 138.15 - samples/sec: 446.59 - lr: 0.000050 - momentum: 0.000000
2023-10-18 23:51:58,500 ----------------------------------------------------------------------------------------------------
2023-10-18 23:51:58,501 EPOCH 1 done: loss 1.4681 - lr: 0.000050
2023-10-18 23:52:10,741 DEV : loss 0.48693621158599854 - f1-score (micro avg) 0.7049
2023-10-18 23:52:10,765 saving best model
2023-10-18 23:52:11,236 ----------------------------------------------------------------------------------------------------
2023-10-18 23:52:24,640 epoch 2 - iter 43/432 - loss 0.49120434 - time (sec): 13.40 - samples/sec: 475.14 - lr: 0.000049 - momentum: 0.000000
2023-10-18 23:52:37,470 epoch 2 - iter 86/432 - loss 0.47518885 - time (sec): 26.23 - samples/sec: 473.80 - lr: 0.000049 - momentum: 0.000000
2023-10-18 23:52:50,585 epoch 2 - iter 129/432 - loss 0.46930182 - time (sec): 39.35 - samples/sec: 467.10 - lr: 0.000048 - momentum: 0.000000
2023-10-18 23:53:03,976 epoch 2 - iter 172/432 - loss 0.45056911 - time (sec): 52.74 - samples/sec: 469.05 - lr: 0.000048 - momentum: 0.000000
2023-10-18 23:53:18,853 epoch 2 - iter 215/432 - loss 0.43792445 - time (sec): 67.62 - samples/sec: 458.89 - lr: 0.000047 - momentum: 0.000000
2023-10-18 23:53:32,607 epoch 2 - iter 258/432 - loss 0.43015339 - time (sec): 81.37 - samples/sec: 460.55 - lr: 0.000047 - momentum: 0.000000
2023-10-18 23:53:46,790 epoch 2 - iter 301/432 - loss 0.41739009 - time (sec): 95.55 - samples/sec: 457.84 - lr: 0.000046 - momentum: 0.000000
2023-10-18 23:54:00,936 epoch 2 - iter 344/432 - loss 0.40919722 - time (sec): 109.70 - samples/sec: 452.88 - lr: 0.000046 - momentum: 0.000000
2023-10-18 23:54:15,261 epoch 2 - iter 387/432 - loss 0.39789179 - time (sec): 124.02 - samples/sec: 448.77 - lr: 0.000045 - momentum: 0.000000
2023-10-18 23:54:30,460 epoch 2 - iter 430/432 - loss 0.39342103 - time (sec): 139.22 - samples/sec: 443.36 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:54:30,977 ----------------------------------------------------------------------------------------------------
2023-10-18 23:54:30,978 EPOCH 2 done: loss 0.3932 - lr: 0.000044
2023-10-18 23:54:43,400 DEV : loss 0.3221401870250702 - f1-score (micro avg) 0.7999
2023-10-18 23:54:43,424 saving best model
2023-10-18 23:54:44,719 ----------------------------------------------------------------------------------------------------
2023-10-18 23:54:58,597 epoch 3 - iter 43/432 - loss 0.26073780 - time (sec): 13.88 - samples/sec: 440.91 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:55:12,781 epoch 3 - iter 86/432 - loss 0.26640971 - time (sec): 28.06 - samples/sec: 433.46 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:55:27,533 epoch 3 - iter 129/432 - loss 0.25801074 - time (sec): 42.81 - samples/sec: 424.09 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:55:42,492 epoch 3 - iter 172/432 - loss 0.24957799 - time (sec): 57.77 - samples/sec: 421.11 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:55:57,651 epoch 3 - iter 215/432 - loss 0.25398995 - time (sec): 72.93 - samples/sec: 417.81 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:56:12,408 epoch 3 - iter 258/432 - loss 0.25510902 - time (sec): 87.69 - samples/sec: 421.63 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:56:27,276 epoch 3 - iter 301/432 - loss 0.25604031 - time (sec): 102.55 - samples/sec: 420.25 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:56:41,333 epoch 3 - iter 344/432 - loss 0.25487803 - time (sec): 116.61 - samples/sec: 424.44 - lr: 0.000040 - momentum: 0.000000
2023-10-18 23:56:55,954 epoch 3 - iter 387/432 - loss 0.25224648 - time (sec): 131.23 - samples/sec: 425.07 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:57:10,652 epoch 3 - iter 430/432 - loss 0.24985880 - time (sec): 145.93 - samples/sec: 422.88 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:57:11,301 ----------------------------------------------------------------------------------------------------
2023-10-18 23:57:11,301 EPOCH 3 done: loss 0.2495 - lr: 0.000039
2023-10-18 23:57:24,474 DEV : loss 0.2786136269569397 - f1-score (micro avg) 0.8259
2023-10-18 23:57:24,498 saving best model
2023-10-18 23:57:25,777 ----------------------------------------------------------------------------------------------------
2023-10-18 23:57:39,701 epoch 4 - iter 43/432 - loss 0.17159693 - time (sec): 13.92 - samples/sec: 472.47 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:57:55,199 epoch 4 - iter 86/432 - loss 0.17789331 - time (sec): 29.42 - samples/sec: 433.90 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:58:10,250 epoch 4 - iter 129/432 - loss 0.17886504 - time (sec): 44.47 - samples/sec: 428.99 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:58:24,351 epoch 4 - iter 172/432 - loss 0.17380496 - time (sec): 58.57 - samples/sec: 428.72 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:58:40,004 epoch 4 - iter 215/432 - loss 0.17639725 - time (sec): 74.23 - samples/sec: 426.47 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:58:55,251 epoch 4 - iter 258/432 - loss 0.17695320 - time (sec): 89.47 - samples/sec: 425.05 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:59:10,163 epoch 4 - iter 301/432 - loss 0.17433887 - time (sec): 104.39 - samples/sec: 422.34 - lr: 0.000035 - momentum: 0.000000
2023-10-18 23:59:24,925 epoch 4 - iter 344/432 - loss 0.17056141 - time (sec): 119.15 - samples/sec: 416.50 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:59:40,330 epoch 4 - iter 387/432 - loss 0.17233317 - time (sec): 134.55 - samples/sec: 414.20 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:59:55,682 epoch 4 - iter 430/432 - loss 0.17233952 - time (sec): 149.90 - samples/sec: 411.30 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:59:56,286 ----------------------------------------------------------------------------------------------------
2023-10-18 23:59:56,286 EPOCH 4 done: loss 0.1719 - lr: 0.000033
2023-10-19 00:00:09,553 DEV : loss 0.3027258515357971 - f1-score (micro avg) 0.833
2023-10-19 00:00:09,578 saving best model
2023-10-19 00:00:10,866 ----------------------------------------------------------------------------------------------------
2023-10-19 00:00:24,962 epoch 5 - iter 43/432 - loss 0.13224073 - time (sec): 14.09 - samples/sec: 412.85 - lr: 0.000033 - momentum: 0.000000
2023-10-19 00:00:40,762 epoch 5 - iter 86/432 - loss 0.12870566 - time (sec): 29.89 - samples/sec: 396.63 - lr: 0.000032 - momentum: 0.000000
2023-10-19 00:00:56,297 epoch 5 - iter 129/432 - loss 0.12680988 - time (sec): 45.43 - samples/sec: 391.33 - lr: 0.000032 - momentum: 0.000000
2023-10-19 00:01:11,298 epoch 5 - iter 172/432 - loss 0.12448631 - time (sec): 60.43 - samples/sec: 397.60 - lr: 0.000031 - momentum: 0.000000
2023-10-19 00:01:25,586 epoch 5 - iter 215/432 - loss 0.12976948 - time (sec): 74.72 - samples/sec: 400.65 - lr: 0.000031 - momentum: 0.000000
2023-10-19 00:01:40,737 epoch 5 - iter 258/432 - loss 0.13032162 - time (sec): 89.87 - samples/sec: 401.15 - lr: 0.000030 - momentum: 0.000000
2023-10-19 00:01:55,989 epoch 5 - iter 301/432 - loss 0.13168277 - time (sec): 105.12 - samples/sec: 405.48 - lr: 0.000029 - momentum: 0.000000
2023-10-19 00:02:09,035 epoch 5 - iter 344/432 - loss 0.13191984 - time (sec): 118.17 - samples/sec: 415.52 - lr: 0.000029 - momentum: 0.000000
2023-10-19 00:02:24,258 epoch 5 - iter 387/432 - loss 0.13139996 - time (sec): 133.39 - samples/sec: 414.77 - lr: 0.000028 - momentum: 0.000000
2023-10-19 00:02:40,019 epoch 5 - iter 430/432 - loss 0.13241412 - time (sec): 149.15 - samples/sec: 413.75 - lr: 0.000028 - momentum: 0.000000
2023-10-19 00:02:40,513 ----------------------------------------------------------------------------------------------------
2023-10-19 00:02:40,513 EPOCH 5 done: loss 0.1323 - lr: 0.000028
2023-10-19 00:02:53,879 DEV : loss 0.3319297134876251 - f1-score (micro avg) 0.8275
2023-10-19 00:02:53,909 ----------------------------------------------------------------------------------------------------
2023-10-19 00:03:09,440 epoch 6 - iter 43/432 - loss 0.08999368 - time (sec): 15.53 - samples/sec: 399.56 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:03:23,839 epoch 6 - iter 86/432 - loss 0.08539150 - time (sec): 29.93 - samples/sec: 429.70 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:03:38,488 epoch 6 - iter 129/432 - loss 0.09986893 - time (sec): 44.58 - samples/sec: 422.57 - lr: 0.000026 - momentum: 0.000000
2023-10-19 00:03:54,568 epoch 6 - iter 172/432 - loss 0.09913431 - time (sec): 60.66 - samples/sec: 415.33 - lr: 0.000026 - momentum: 0.000000
2023-10-19 00:04:09,161 epoch 6 - iter 215/432 - loss 0.09906129 - time (sec): 75.25 - samples/sec: 409.41 - lr: 0.000025 - momentum: 0.000000
2023-10-19 00:04:23,814 epoch 6 - iter 258/432 - loss 0.09924250 - time (sec): 89.90 - samples/sec: 411.32 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:04:38,652 epoch 6 - iter 301/432 - loss 0.10059479 - time (sec): 104.74 - samples/sec: 412.85 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:04:53,540 epoch 6 - iter 344/432 - loss 0.09936378 - time (sec): 119.63 - samples/sec: 413.63 - lr: 0.000023 - momentum: 0.000000
2023-10-19 00:05:08,991 epoch 6 - iter 387/432 - loss 0.09927245 - time (sec): 135.08 - samples/sec: 410.82 - lr: 0.000023 - momentum: 0.000000
2023-10-19 00:05:25,338 epoch 6 - iter 430/432 - loss 0.09843081 - time (sec): 151.43 - samples/sec: 407.47 - lr: 0.000022 - momentum: 0.000000
2023-10-19 00:05:25,876 ----------------------------------------------------------------------------------------------------
2023-10-19 00:05:25,876 EPOCH 6 done: loss 0.0985 - lr: 0.000022
2023-10-19 00:05:39,277 DEV : loss 0.3483152687549591 - f1-score (micro avg) 0.8353
2023-10-19 00:05:39,301 saving best model
2023-10-19 00:05:41,347 ----------------------------------------------------------------------------------------------------
2023-10-19 00:05:56,804 epoch 7 - iter 43/432 - loss 0.07646855 - time (sec): 15.46 - samples/sec: 383.21 - lr: 0.000022 - momentum: 0.000000
2023-10-19 00:06:12,246 epoch 7 - iter 86/432 - loss 0.07564571 - time (sec): 30.90 - samples/sec: 382.43 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:06:26,486 epoch 7 - iter 129/432 - loss 0.07414267 - time (sec): 45.14 - samples/sec: 405.53 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:06:41,108 epoch 7 - iter 172/432 - loss 0.07110179 - time (sec): 59.76 - samples/sec: 418.47 - lr: 0.000020 - momentum: 0.000000
2023-10-19 00:06:56,294 epoch 7 - iter 215/432 - loss 0.07443072 - time (sec): 74.95 - samples/sec: 413.30 - lr: 0.000019 - momentum: 0.000000
2023-10-19 00:07:10,672 epoch 7 - iter 258/432 - loss 0.07597555 - time (sec): 89.32 - samples/sec: 417.24 - lr: 0.000019 - momentum: 0.000000
2023-10-19 00:07:25,487 epoch 7 - iter 301/432 - loss 0.07746008 - time (sec): 104.14 - samples/sec: 417.75 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:07:40,229 epoch 7 - iter 344/432 - loss 0.07686569 - time (sec): 118.88 - samples/sec: 417.03 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:07:55,052 epoch 7 - iter 387/432 - loss 0.07546838 - time (sec): 133.70 - samples/sec: 413.79 - lr: 0.000017 - momentum: 0.000000
2023-10-19 00:08:09,443 epoch 7 - iter 430/432 - loss 0.07435736 - time (sec): 148.09 - samples/sec: 415.90 - lr: 0.000017 - momentum: 0.000000
2023-10-19 00:08:10,004 ----------------------------------------------------------------------------------------------------
2023-10-19 00:08:10,005 EPOCH 7 done: loss 0.0749 - lr: 0.000017
2023-10-19 00:08:22,467 DEV : loss 0.35942888259887695 - f1-score (micro avg) 0.8395
2023-10-19 00:08:22,492 saving best model
2023-10-19 00:08:23,780 ----------------------------------------------------------------------------------------------------
2023-10-19 00:08:37,812 epoch 8 - iter 43/432 - loss 0.05403566 - time (sec): 14.03 - samples/sec: 443.97 - lr: 0.000016 - momentum: 0.000000
2023-10-19 00:08:50,468 epoch 8 - iter 86/432 - loss 0.05416407 - time (sec): 26.69 - samples/sec: 465.48 - lr: 0.000016 - momentum: 0.000000
2023-10-19 00:09:05,136 epoch 8 - iter 129/432 - loss 0.05231093 - time (sec): 41.35 - samples/sec: 445.46 - lr: 0.000015 - momentum: 0.000000
2023-10-19 00:09:19,064 epoch 8 - iter 172/432 - loss 0.05332684 - time (sec): 55.28 - samples/sec: 439.02 - lr: 0.000014 - momentum: 0.000000
2023-10-19 00:09:32,357 epoch 8 - iter 215/432 - loss 0.05490906 - time (sec): 68.58 - samples/sec: 442.24 - lr: 0.000014 - momentum: 0.000000
2023-10-19 00:09:45,819 epoch 8 - iter 258/432 - loss 0.05470571 - time (sec): 82.04 - samples/sec: 445.59 - lr: 0.000013 - momentum: 0.000000
2023-10-19 00:09:59,775 epoch 8 - iter 301/432 - loss 0.05508927 - time (sec): 95.99 - samples/sec: 444.48 - lr: 0.000013 - momentum: 0.000000
2023-10-19 00:10:13,551 epoch 8 - iter 344/432 - loss 0.05696792 - time (sec): 109.77 - samples/sec: 443.76 - lr: 0.000012 - momentum: 0.000000
2023-10-19 00:10:28,569 epoch 8 - iter 387/432 - loss 0.05709499 - time (sec): 124.79 - samples/sec: 440.02 - lr: 0.000012 - momentum: 0.000000
2023-10-19 00:10:42,292 epoch 8 - iter 430/432 - loss 0.05606310 - time (sec): 138.51 - samples/sec: 445.16 - lr: 0.000011 - momentum: 0.000000
2023-10-19 00:10:42,821 ----------------------------------------------------------------------------------------------------
2023-10-19 00:10:42,821 EPOCH 8 done: loss 0.0560 - lr: 0.000011
2023-10-19 00:10:55,201 DEV : loss 0.3855433464050293 - f1-score (micro avg) 0.8399
2023-10-19 00:10:55,225 saving best model
2023-10-19 00:10:56,513 ----------------------------------------------------------------------------------------------------
2023-10-19 00:11:09,243 epoch 9 - iter 43/432 - loss 0.04130631 - time (sec): 12.73 - samples/sec: 503.43 - lr: 0.000011 - momentum: 0.000000
2023-10-19 00:11:22,023 epoch 9 - iter 86/432 - loss 0.03767518 - time (sec): 25.51 - samples/sec: 484.19 - lr: 0.000010 - momentum: 0.000000
2023-10-19 00:11:35,992 epoch 9 - iter 129/432 - loss 0.04012592 - time (sec): 39.48 - samples/sec: 467.91 - lr: 0.000009 - momentum: 0.000000
2023-10-19 00:11:50,170 epoch 9 - iter 172/432 - loss 0.03874753 - time (sec): 53.66 - samples/sec: 461.63 - lr: 0.000009 - momentum: 0.000000
2023-10-19 00:12:03,449 epoch 9 - iter 215/432 - loss 0.03749511 - time (sec): 66.93 - samples/sec: 465.10 - lr: 0.000008 - momentum: 0.000000
2023-10-19 00:12:17,477 epoch 9 - iter 258/432 - loss 0.03805076 - time (sec): 80.96 - samples/sec: 460.64 - lr: 0.000008 - momentum: 0.000000
2023-10-19 00:12:30,864 epoch 9 - iter 301/432 - loss 0.03919902 - time (sec): 94.35 - samples/sec: 462.58 - lr: 0.000007 - momentum: 0.000000
2023-10-19 00:12:45,129 epoch 9 - iter 344/432 - loss 0.04055903 - time (sec): 108.62 - samples/sec: 455.14 - lr: 0.000007 - momentum: 0.000000
2023-10-19 00:12:59,088 epoch 9 - iter 387/432 - loss 0.04016294 - time (sec): 122.57 - samples/sec: 452.94 - lr: 0.000006 - momentum: 0.000000
2023-10-19 00:13:12,844 epoch 9 - iter 430/432 - loss 0.04071803 - time (sec): 136.33 - samples/sec: 452.09 - lr: 0.000006 - momentum: 0.000000
2023-10-19 00:13:13,346 ----------------------------------------------------------------------------------------------------
2023-10-19 00:13:13,346 EPOCH 9 done: loss 0.0406 - lr: 0.000006
2023-10-19 00:13:25,615 DEV : loss 0.398950457572937 - f1-score (micro avg) 0.8446
2023-10-19 00:13:25,640 saving best model
2023-10-19 00:13:26,920 ----------------------------------------------------------------------------------------------------
2023-10-19 00:13:40,602 epoch 10 - iter 43/432 - loss 0.03991636 - time (sec): 13.68 - samples/sec: 438.30 - lr: 0.000005 - momentum: 0.000000
2023-10-19 00:13:54,198 epoch 10 - iter 86/432 - loss 0.03709518 - time (sec): 27.28 - samples/sec: 436.13 - lr: 0.000004 - momentum: 0.000000
2023-10-19 00:14:07,245 epoch 10 - iter 129/432 - loss 0.03665230 - time (sec): 40.32 - samples/sec: 453.56 - lr: 0.000004 - momentum: 0.000000
2023-10-19 00:14:21,656 epoch 10 - iter 172/432 - loss 0.03488483 - time (sec): 54.73 - samples/sec: 449.35 - lr: 0.000003 - momentum: 0.000000
2023-10-19 00:14:35,583 epoch 10 - iter 215/432 - loss 0.03260316 - time (sec): 68.66 - samples/sec: 449.20 - lr: 0.000003 - momentum: 0.000000
2023-10-19 00:14:50,542 epoch 10 - iter 258/432 - loss 0.03311250 - time (sec): 83.62 - samples/sec: 442.51 - lr: 0.000002 - momentum: 0.000000
2023-10-19 00:15:04,584 epoch 10 - iter 301/432 - loss 0.03267573 - time (sec): 97.66 - samples/sec: 440.01 - lr: 0.000002 - momentum: 0.000000
2023-10-19 00:15:20,118 epoch 10 - iter 344/432 - loss 0.03135514 - time (sec): 113.20 - samples/sec: 434.14 - lr: 0.000001 - momentum: 0.000000
2023-10-19 00:15:35,441 epoch 10 - iter 387/432 - loss 0.03196909 - time (sec): 128.52 - samples/sec: 431.63 - lr: 0.000001 - momentum: 0.000000
2023-10-19 00:15:50,021 epoch 10 - iter 430/432 - loss 0.03303907 - time (sec): 143.10 - samples/sec: 430.68 - lr: 0.000000 - momentum: 0.000000
2023-10-19 00:15:50,585 ----------------------------------------------------------------------------------------------------
2023-10-19 00:15:50,585 EPOCH 10 done: loss 0.0330 - lr: 0.000000
2023-10-19 00:16:03,688 DEV : loss 0.41425013542175293 - f1-score (micro avg) 0.8446
2023-10-19 00:16:03,714 saving best model
2023-10-19 00:16:05,498 ----------------------------------------------------------------------------------------------------
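The ten `DEV :` lines logged after each epoch can be pulled out with a small parser to see how dev loss and dev micro-F1 evolve. This is a minimal sketch that relies only on the line format visible in this log; the regular expression and the helper name `dev_scores` are illustrative and are not part of Flair.

```python
import re

# Matches lines such as:
# "2023-10-18 23:52:10,741 DEV : loss 0.48693621158599854 - f1-score (micro avg) 0.7049"
DEV_LINE = re.compile(
    r"DEV : loss (?P<loss>\d+\.\d+) - f1-score \(micro avg\) (?P<f1>\d+\.\d+)"
)


def dev_scores(lines):
    """Return one (dev_loss, dev_f1) tuple per DEV line, in log order."""
    scores = []
    for line in lines:
        match = DEV_LINE.search(line)
        if match:
            scores.append((float(match["loss"]), float(match["f1"])))
    return scores


# The DEV lines of this log, one per epoch.
LOG_LINES = [
    "2023-10-18 23:52:10,741 DEV : loss 0.48693621158599854 - f1-score (micro avg) 0.7049",
    "2023-10-18 23:54:43,400 DEV : loss 0.3221401870250702 - f1-score (micro avg) 0.7999",
    "2023-10-18 23:57:24,474 DEV : loss 0.2786136269569397 - f1-score (micro avg) 0.8259",
    "2023-10-19 00:00:09,553 DEV : loss 0.3027258515357971 - f1-score (micro avg) 0.833",
    "2023-10-19 00:02:53,879 DEV : loss 0.3319297134876251 - f1-score (micro avg) 0.8275",
    "2023-10-19 00:05:39,277 DEV : loss 0.3483152687549591 - f1-score (micro avg) 0.8353",
    "2023-10-19 00:08:22,467 DEV : loss 0.35942888259887695 - f1-score (micro avg) 0.8395",
    "2023-10-19 00:10:55,201 DEV : loss 0.3855433464050293 - f1-score (micro avg) 0.8399",
    "2023-10-19 00:13:25,615 DEV : loss 0.398950457572937 - f1-score (micro avg) 0.8446",
    "2023-10-19 00:16:03,688 DEV : loss 0.41425013542175293 - f1-score (micro avg) 0.8446",
]

scores = dev_scores(LOG_LINES)
f1_values = [f1 for _, f1 in scores]
losses = [loss for loss, _ in scores]

# Epochs are 1-based; list.index returns the first epoch that reaches the extreme.
first_best_f1_epoch = f1_values.index(max(f1_values)) + 1
lowest_loss_epoch = losses.index(min(losses)) + 1

if __name__ == "__main__":
    print("first epoch with best dev F1:", first_best_f1_epoch)
    print("epoch with lowest dev loss:", lowest_loss_epoch)
```

On this log the dev loss is lowest at epoch 3 while dev micro-F1 keeps rising until epoch 9 (0.8446) and is unchanged at epoch 10, which is consistent with model selection on the `('micro avg', 'f1-score')` metric rather than on loss.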
2023-10-19 00:16:05,500 Loading model from best epoch ...
2023-10-19 00:16:07,893 SequenceTagger predicts: Dictionary with 81 tags: O, S-location-route, B-location-route, E-location-route, I-location-route, S-location-stop, B-location-stop, E-location-stop, I-location-stop, S-trigger, B-trigger, E-trigger, I-trigger, S-organization-company, B-organization-company, E-organization-company, I-organization-company, S-location-city, B-location-city, E-location-city, I-location-city, S-location, B-location, E-location, I-location, S-event-cause, B-event-cause, E-event-cause, I-event-cause, S-location-street, B-location-street, E-location-street, I-location-street, S-time, B-time, E-time, I-time, S-date, B-date, E-date, I-date, S-number, B-number, E-number, I-number, S-duration, B-duration, E-duration, I-duration, S-organization
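The tag dictionary above follows the pattern `O` plus four prefixed tags (`S-`, `B-`, `E-`, `I-`) per entity type, and the printed list is cut off after `S-organization`. A short sketch of that arithmetic, assuming the pattern holds for the tags that are not shown: 81 tags would correspond to (81 - 1) / 4 = 20 entity types, which also matches `out_features=81` in the model's final linear layer. The helper `bioes_tags` is hypothetical and only rebuilds the visible part of the list.

```python
def bioes_tags(entity_types):
    """Build a tag list in the order the log prints it: O, then S/B/E/I per type."""
    tags = ["O"]
    for entity_type in entity_types:
        tags.extend(f"{prefix}-{entity_type}" for prefix in ("S", "B", "E", "I"))
    return tags


# The 12 entity types whose four tags are fully visible in the log line.
VISIBLE_TYPES = [
    "location-route", "location-stop", "trigger", "organization-company",
    "location-city", "location", "event-cause", "location-street",
    "time", "date", "number", "duration",
]

visible_tags = bioes_tags(VISIBLE_TYPES)

# Assumption: every entity type contributes exactly four tags besides "O".
TOTAL_TAGS = 81
implied_entity_types, remainder = divmod(TOTAL_TAGS - 1, 4)

if __name__ == "__main__":
    print(len(visible_tags), "tags rebuilt from the visible part of the list")
    print(implied_entity_types, "entity types implied by", TOTAL_TAGS, "tags")
```

The 12 complete types account for 49 tags; together with the trailing `S-organization` that makes the 50 tags shown in the log line.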
2023-10-19 00:16:25,689
Results:
- F-score (micro) 0.7706
- F-score (macro) 0.5891
- Accuracy 0.6737

By class:
                       precision    recall  f1-score   support

             trigger    0.7277    0.6062    0.6614       833
       location-stop    0.8421    0.8157    0.8287       765
            location    0.8096    0.8376    0.8234       665
       location-city    0.8068    0.8852    0.8441       566
                date    0.9013    0.8579    0.8791       394
     location-street    0.9417    0.8782    0.9088       386
                time    0.7855    0.8867    0.8330       256
      location-route    0.9053    0.7746    0.8349       284
organization-company    0.7838    0.6905    0.7342       252
            distance    0.9882    1.0000    0.9940       167
              number    0.6760    0.8121    0.7378       149
            duration    0.3484    0.3313    0.3396       163
         event-cause    0.0000    0.0000    0.0000         0
       disaster-type    0.9310    0.3913    0.5510        69
        organization    0.5769    0.5357    0.5556        28
              person    0.5000    1.0000    0.6667        10
                 set    0.0000    0.0000    0.0000         0
        org-position    0.0000    0.0000    0.0000         1
               money    0.0000    0.0000    0.0000         0

           micro avg    0.7637    0.7777    0.7706      4988
           macro avg    0.6065    0.5949    0.5891      4988
        weighted avg    0.8075    0.7777    0.7888      4988

2023-10-19 00:16:25,690 ----------------------------------------------------------------------------------------------------
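The three average rows of the report can be re-derived from the per-class rows, which helps explain the gap between micro and macro F1. The sketch below uses the standard definitions (macro = unweighted mean over classes, weighted = support-weighted mean, F1 = harmonic mean of precision and recall); that the evaluation code behind this log computes them the same way is an assumption, supported only by the fact that the numbers agree to four decimals.

```python
# (label, precision, recall, f1, support), copied from the "By class" table.
ROWS = [
    ("trigger", 0.7277, 0.6062, 0.6614, 833),
    ("location-stop", 0.8421, 0.8157, 0.8287, 765),
    ("location", 0.8096, 0.8376, 0.8234, 665),
    ("location-city", 0.8068, 0.8852, 0.8441, 566),
    ("date", 0.9013, 0.8579, 0.8791, 394),
    ("location-street", 0.9417, 0.8782, 0.9088, 386),
    ("time", 0.7855, 0.8867, 0.8330, 256),
    ("location-route", 0.9053, 0.7746, 0.8349, 284),
    ("organization-company", 0.7838, 0.6905, 0.7342, 252),
    ("distance", 0.9882, 1.0000, 0.9940, 167),
    ("number", 0.6760, 0.8121, 0.7378, 149),
    ("duration", 0.3484, 0.3313, 0.3396, 163),
    ("event-cause", 0.0000, 0.0000, 0.0000, 0),
    ("disaster-type", 0.9310, 0.3913, 0.5510, 69),
    ("organization", 0.5769, 0.5357, 0.5556, 28),
    ("person", 0.5000, 1.0000, 0.6667, 10),
    ("set", 0.0000, 0.0000, 0.0000, 0),
    ("org-position", 0.0000, 0.0000, 0.0000, 1),
    ("money", 0.0000, 0.0000, 0.0000, 0),
]


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_f1(rows):
    """Unweighted mean of per-class F1, zero-support classes included."""
    return sum(row[3] for row in rows) / len(rows)


def weighted_f1(rows):
    """Per-class F1 averaged with the class support as weight."""
    total_support = sum(row[4] for row in rows)
    return sum(row[3] * row[4] for row in rows) / total_support


total_support = sum(row[4] for row in ROWS)
micro = f1_from(0.7637, 0.7777)  # from the "micro avg" precision and recall

if __name__ == "__main__":
    print(f"support  {total_support}")
    print(f"micro F1 {micro:.4f}")
    print(f"macro F1 {macro_f1(ROWS):.4f}")
    print(f"weighted {weighted_f1(ROWS):.4f}")
```

The macro average only reproduces the reported 0.5891 when the four rows with an F1 of 0.0000 (`event-cause`, `set`, `org-position`, `money`) are counted among the 19 classes, so those rows account for much of the distance between macro F1 and the micro F1 of 0.7706.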