stefan-it committed on
Commit
dc89aad
1 Parent(s): de24c7d

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4b671b823478832236fcd904179378ef919281003b5fcca7b75a7d6556048f8c
+ size 440954373
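The three added lines are a Git LFS pointer rather than the model weights themselves. As a minimal sketch, the pointer can be split into its fields like this; `parse_lfs_pointer` and the returned key names are hypothetical helpers for illustration, not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", separated by one space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the best-model.pt hunk above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4b671b823478832236fcd904179378ef919281003b5fcca7b75a7d6556048f8c
size 440954373
"""

info = parse_lfs_pointer(POINTER)
print(f"{info['hash_algo']} digest, {info['size_bytes'] / 1e6:.1f} MB payload")
```

The `size` field is the byte size of the real file stored in LFS, so the checkpoint is roughly 441 MB.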
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 10:23:39 0.0000 0.4002 0.1138 0.2365 0.4811 0.3171 0.1900
+ 2 10:28:26 0.0000 0.1695 0.2231 0.2248 0.6913 0.3392 0.2055
+ 3 10:33:16 0.0000 0.1163 0.1856 0.3028 0.4318 0.3560 0.2171
+ 4 10:37:58 0.0000 0.0856 0.2579 0.2837 0.5455 0.3733 0.2317
+ 5 10:42:40 0.0000 0.0604 0.3069 0.3289 0.5644 0.4156 0.2633
+ 6 10:47:24 0.0000 0.0429 0.3329 0.2887 0.6364 0.3972 0.2496
+ 7 10:52:12 0.0000 0.0314 0.3632 0.3168 0.5795 0.4096 0.2587
+ 8 10:57:05 0.0000 0.0203 0.3939 0.3024 0.6174 0.4060 0.2561
+ 9 11:02:00 0.0000 0.0147 0.5024 0.2806 0.6345 0.3891 0.2431
+ 10 11:06:51 0.0000 0.0093 0.5263 0.2851 0.6383 0.3942 0.2469
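The per-epoch table can be read programmatically to see which epoch the checkpoint comes from. The sketch below embeds the rows of loss.tsv as whitespace-separated text (an assumption made for self-containment; the real file is tab-separated) and picks the epoch with the highest DEV_F1, which is the selection metric named in training.log.

```python
# Rows copied from the loss.tsv hunk above; columns are split on whitespace.
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 10:23:39 0.0000 0.4002 0.1138 0.2365 0.4811 0.3171 0.1900
2 10:28:26 0.0000 0.1695 0.2231 0.2248 0.6913 0.3392 0.2055
3 10:33:16 0.0000 0.1163 0.1856 0.3028 0.4318 0.3560 0.2171
4 10:37:58 0.0000 0.0856 0.2579 0.2837 0.5455 0.3733 0.2317
5 10:42:40 0.0000 0.0604 0.3069 0.3289 0.5644 0.4156 0.2633
6 10:47:24 0.0000 0.0429 0.3329 0.2887 0.6364 0.3972 0.2496
7 10:52:12 0.0000 0.0314 0.3632 0.3168 0.5795 0.4096 0.2587
8 10:57:05 0.0000 0.0203 0.3939 0.3024 0.6174 0.4060 0.2561
9 11:02:00 0.0000 0.0147 0.5024 0.2806 0.6345 0.3891 0.2431
10 11:06:51 0.0000 0.0093 0.5263 0.2851 0.6383 0.3942 0.2469
"""

lines = LOSS_TSV.strip().splitlines()
header = lines[0].split()
rows = [dict(zip(header, line.split())) for line in lines[1:]]

# Epoch with the highest dev F1 (the model-selection metric).
best_f1 = max(rows, key=lambda r: float(r["DEV_F1"]))
# Epoch with the lowest dev loss, for comparison.
lowest_loss = min(rows, key=lambda r: float(r["DEV_LOSS"]))

print(f"best DEV_F1:     epoch {best_f1['EPOCH']} ({best_f1['DEV_F1']})")
print(f"lowest DEV_LOSS: epoch {lowest_loss['EPOCH']} ({lowest_loss['DEV_LOSS']})")
```

The two criteria disagree here: dev loss is lowest after epoch 1 and rises while train loss keeps falling, whereas dev F1 peaks at epoch 5. The LEARNING_RATE column reads 0.0000 only because values on the order of 5e-05 are written with four decimals.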
runs/events.out.tfevents.1697537938.3ae7c61396a7.1160.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7cc966cbdceee0323f9550c9e520f1f9d984c0dac8ea89cd5ffcb51a41b1507c
+ size 1464420
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,239 @@
+ 2023-10-17 10:18:58,581 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:18:58,582 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): ElectraModel(
+       (embeddings): ElectraEmbeddings(
+         (word_embeddings): Embedding(32001, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): ElectraEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x ElectraLayer(
+             (attention): ElectraAttention(
+               (self): ElectraSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): ElectraSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): ElectraIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): ElectraOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=17, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-17 10:18:58,583 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:18:58,583 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
+ - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
+ 2023-10-17 10:18:58,583 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:18:58,583 Train: 20847 sentences
+ 2023-10-17 10:18:58,583 (train_with_dev=False, train_with_test=False)
+ 2023-10-17 10:18:58,583 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:18:58,583 Training Params:
+ 2023-10-17 10:18:58,583 - learning_rate: "5e-05"
+ 2023-10-17 10:18:58,583 - mini_batch_size: "8"
+ 2023-10-17 10:18:58,583 - max_epochs: "10"
+ 2023-10-17 10:18:58,584 - shuffle: "True"
+ 2023-10-17 10:18:58,584 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:18:58,584 Plugins:
+ 2023-10-17 10:18:58,584 - TensorboardLogger
+ 2023-10-17 10:18:58,584 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-17 10:18:58,584 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:18:58,584 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-17 10:18:58,584 - metric: "('micro avg', 'f1-score')"
+ 2023-10-17 10:18:58,584 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:18:58,584 Computation:
+ 2023-10-17 10:18:58,584 - compute on device: cuda:0
+ 2023-10-17 10:18:58,584 - embedding storage: none
+ 2023-10-17 10:18:58,584 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:18:58,584 Model training base path: "hmbench-newseye/de-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
+ 2023-10-17 10:18:58,584 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:18:58,585 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:18:58,585 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-17 10:19:25,620 epoch 1 - iter 260/2606 - loss 1.88523681 - time (sec): 27.03 - samples/sec: 1244.18 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 10:19:52,497 epoch 1 - iter 520/2606 - loss 1.12382223 - time (sec): 53.91 - samples/sec: 1283.76 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 10:20:20,300 epoch 1 - iter 780/2606 - loss 0.83719157 - time (sec): 81.71 - samples/sec: 1310.34 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 10:20:47,239 epoch 1 - iter 1040/2606 - loss 0.68970036 - time (sec): 108.65 - samples/sec: 1336.82 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 10:21:14,600 epoch 1 - iter 1300/2606 - loss 0.59439782 - time (sec): 136.01 - samples/sec: 1353.21 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 10:21:42,308 epoch 1 - iter 1560/2606 - loss 0.52755960 - time (sec): 163.72 - samples/sec: 1362.27 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 10:22:10,369 epoch 1 - iter 1820/2606 - loss 0.48679454 - time (sec): 191.78 - samples/sec: 1346.47 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-17 10:22:37,175 epoch 1 - iter 2080/2606 - loss 0.45521738 - time (sec): 218.59 - samples/sec: 1338.76 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-17 10:23:04,173 epoch 1 - iter 2340/2606 - loss 0.42400062 - time (sec): 245.59 - samples/sec: 1345.15 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-17 10:23:31,264 epoch 1 - iter 2600/2606 - loss 0.40097181 - time (sec): 272.68 - samples/sec: 1344.06 - lr: 0.000050 - momentum: 0.000000
+ 2023-10-17 10:23:31,864 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:23:31,864 EPOCH 1 done: loss 0.4002 - lr: 0.000050
+ 2023-10-17 10:23:39,220 DEV : loss 0.11381553113460541 - f1-score (micro avg) 0.3171
+ 2023-10-17 10:23:39,272 saving best model
+ 2023-10-17 10:23:39,802 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:24:08,030 epoch 2 - iter 260/2606 - loss 0.17285748 - time (sec): 28.23 - samples/sec: 1355.61 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-17 10:24:35,777 epoch 2 - iter 520/2606 - loss 0.20526777 - time (sec): 55.97 - samples/sec: 1329.33 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-17 10:25:03,854 epoch 2 - iter 780/2606 - loss 0.19400558 - time (sec): 84.05 - samples/sec: 1333.96 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-17 10:25:30,896 epoch 2 - iter 1040/2606 - loss 0.19145026 - time (sec): 111.09 - samples/sec: 1322.86 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-17 10:25:59,200 epoch 2 - iter 1300/2606 - loss 0.18904313 - time (sec): 139.40 - samples/sec: 1313.04 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-17 10:26:25,856 epoch 2 - iter 1560/2606 - loss 0.18594503 - time (sec): 166.05 - samples/sec: 1319.31 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-17 10:26:54,154 epoch 2 - iter 1820/2606 - loss 0.17938094 - time (sec): 194.35 - samples/sec: 1331.79 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-17 10:27:21,812 epoch 2 - iter 2080/2606 - loss 0.17717797 - time (sec): 222.01 - samples/sec: 1330.89 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-17 10:27:47,173 epoch 2 - iter 2340/2606 - loss 0.17325202 - time (sec): 247.37 - samples/sec: 1329.74 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-17 10:28:13,350 epoch 2 - iter 2600/2606 - loss 0.16972430 - time (sec): 273.55 - samples/sec: 1339.96 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-17 10:28:14,078 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:28:14,078 EPOCH 2 done: loss 0.1695 - lr: 0.000044
+ 2023-10-17 10:28:26,197 DEV : loss 0.223122239112854 - f1-score (micro avg) 0.3392
+ 2023-10-17 10:28:26,258 saving best model
+ 2023-10-17 10:28:27,707 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:28:56,388 epoch 3 - iter 260/2606 - loss 0.11196216 - time (sec): 28.68 - samples/sec: 1321.95 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-17 10:29:23,668 epoch 3 - iter 520/2606 - loss 0.11653677 - time (sec): 55.96 - samples/sec: 1336.99 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-17 10:29:50,270 epoch 3 - iter 780/2606 - loss 0.11963140 - time (sec): 82.56 - samples/sec: 1343.12 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-17 10:30:17,420 epoch 3 - iter 1040/2606 - loss 0.11700179 - time (sec): 109.71 - samples/sec: 1335.04 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-17 10:30:44,113 epoch 3 - iter 1300/2606 - loss 0.11677412 - time (sec): 136.40 - samples/sec: 1341.76 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-17 10:31:10,804 epoch 3 - iter 1560/2606 - loss 0.12086406 - time (sec): 163.09 - samples/sec: 1333.15 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-17 10:31:37,531 epoch 3 - iter 1820/2606 - loss 0.11878217 - time (sec): 189.82 - samples/sec: 1335.69 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-17 10:32:05,066 epoch 3 - iter 2080/2606 - loss 0.11924890 - time (sec): 217.36 - samples/sec: 1335.00 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-17 10:32:34,760 epoch 3 - iter 2340/2606 - loss 0.11857763 - time (sec): 247.05 - samples/sec: 1333.10 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-17 10:33:03,869 epoch 3 - iter 2600/2606 - loss 0.11654712 - time (sec): 276.16 - samples/sec: 1327.20 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-17 10:33:04,518 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:33:04,518 EPOCH 3 done: loss 0.1163 - lr: 0.000039
+ 2023-10-17 10:33:16,361 DEV : loss 0.18563708662986755 - f1-score (micro avg) 0.356
+ 2023-10-17 10:33:16,417 saving best model
+ 2023-10-17 10:33:17,825 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:33:45,981 epoch 4 - iter 260/2606 - loss 0.08809572 - time (sec): 28.15 - samples/sec: 1322.94 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-17 10:34:12,346 epoch 4 - iter 520/2606 - loss 0.08738428 - time (sec): 54.52 - samples/sec: 1348.70 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-17 10:34:37,823 epoch 4 - iter 780/2606 - loss 0.08778704 - time (sec): 79.99 - samples/sec: 1355.87 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-17 10:35:04,756 epoch 4 - iter 1040/2606 - loss 0.08611474 - time (sec): 106.93 - samples/sec: 1346.45 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-17 10:35:30,915 epoch 4 - iter 1300/2606 - loss 0.08825421 - time (sec): 133.09 - samples/sec: 1343.11 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-17 10:35:56,677 epoch 4 - iter 1560/2606 - loss 0.08795449 - time (sec): 158.85 - samples/sec: 1342.10 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-17 10:36:24,361 epoch 4 - iter 1820/2606 - loss 0.08855088 - time (sec): 186.53 - samples/sec: 1348.57 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-17 10:36:51,682 epoch 4 - iter 2080/2606 - loss 0.08701037 - time (sec): 213.85 - samples/sec: 1356.64 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-17 10:37:19,341 epoch 4 - iter 2340/2606 - loss 0.08716051 - time (sec): 241.51 - samples/sec: 1361.96 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-17 10:37:47,256 epoch 4 - iter 2600/2606 - loss 0.08575731 - time (sec): 269.43 - samples/sec: 1361.07 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-17 10:37:47,818 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:37:47,818 EPOCH 4 done: loss 0.0856 - lr: 0.000033
+ 2023-10-17 10:37:58,718 DEV : loss 0.2579371929168701 - f1-score (micro avg) 0.3733
+ 2023-10-17 10:37:58,776 saving best model
+ 2023-10-17 10:38:00,188 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:38:27,739 epoch 5 - iter 260/2606 - loss 0.05100938 - time (sec): 27.55 - samples/sec: 1367.58 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-17 10:38:53,642 epoch 5 - iter 520/2606 - loss 0.04902557 - time (sec): 53.45 - samples/sec: 1337.47 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-17 10:39:20,996 epoch 5 - iter 780/2606 - loss 0.05197574 - time (sec): 80.80 - samples/sec: 1340.12 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-17 10:39:47,084 epoch 5 - iter 1040/2606 - loss 0.05396406 - time (sec): 106.89 - samples/sec: 1332.05 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-17 10:40:16,075 epoch 5 - iter 1300/2606 - loss 0.05874220 - time (sec): 135.88 - samples/sec: 1339.43 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-17 10:40:43,125 epoch 5 - iter 1560/2606 - loss 0.05952086 - time (sec): 162.93 - samples/sec: 1361.63 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-17 10:41:09,653 epoch 5 - iter 1820/2606 - loss 0.06088136 - time (sec): 189.46 - samples/sec: 1363.89 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 10:41:36,811 epoch 5 - iter 2080/2606 - loss 0.06080196 - time (sec): 216.62 - samples/sec: 1365.76 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-17 10:42:02,485 epoch 5 - iter 2340/2606 - loss 0.06099854 - time (sec): 242.29 - samples/sec: 1363.26 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 10:42:29,443 epoch 5 - iter 2600/2606 - loss 0.06045113 - time (sec): 269.25 - samples/sec: 1362.05 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-17 10:42:29,970 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:42:29,970 EPOCH 5 done: loss 0.0604 - lr: 0.000028
+ 2023-10-17 10:42:40,709 DEV : loss 0.3068985044956207 - f1-score (micro avg) 0.4156
+ 2023-10-17 10:42:40,761 saving best model
+ 2023-10-17 10:42:42,155 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:43:08,497 epoch 6 - iter 260/2606 - loss 0.04273978 - time (sec): 26.34 - samples/sec: 1415.83 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 10:43:34,516 epoch 6 - iter 520/2606 - loss 0.04270582 - time (sec): 52.36 - samples/sec: 1379.74 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-17 10:44:00,440 epoch 6 - iter 780/2606 - loss 0.04178911 - time (sec): 78.28 - samples/sec: 1368.55 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 10:44:27,949 epoch 6 - iter 1040/2606 - loss 0.03962571 - time (sec): 105.79 - samples/sec: 1385.07 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-17 10:44:55,668 epoch 6 - iter 1300/2606 - loss 0.04045489 - time (sec): 133.51 - samples/sec: 1387.29 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-17 10:45:22,831 epoch 6 - iter 1560/2606 - loss 0.04097251 - time (sec): 160.67 - samples/sec: 1383.91 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 10:45:50,498 epoch 6 - iter 1820/2606 - loss 0.04012529 - time (sec): 188.34 - samples/sec: 1371.39 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-17 10:46:17,568 epoch 6 - iter 2080/2606 - loss 0.04115226 - time (sec): 215.41 - samples/sec: 1363.74 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 10:46:46,617 epoch 6 - iter 2340/2606 - loss 0.04163284 - time (sec): 244.46 - samples/sec: 1352.33 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-17 10:47:13,563 epoch 6 - iter 2600/2606 - loss 0.04297284 - time (sec): 271.40 - samples/sec: 1350.77 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 10:47:14,120 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:47:14,120 EPOCH 6 done: loss 0.0429 - lr: 0.000022
+ 2023-10-17 10:47:24,944 DEV : loss 0.3329330384731293 - f1-score (micro avg) 0.3972
+ 2023-10-17 10:47:24,995 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:47:52,020 epoch 7 - iter 260/2606 - loss 0.03145306 - time (sec): 27.02 - samples/sec: 1399.79 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-17 10:48:19,096 epoch 7 - iter 520/2606 - loss 0.02828661 - time (sec): 54.10 - samples/sec: 1391.68 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 10:48:46,758 epoch 7 - iter 780/2606 - loss 0.02853558 - time (sec): 81.76 - samples/sec: 1374.54 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-17 10:49:16,621 epoch 7 - iter 1040/2606 - loss 0.02992094 - time (sec): 111.62 - samples/sec: 1335.70 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-17 10:49:44,050 epoch 7 - iter 1300/2606 - loss 0.03061991 - time (sec): 139.05 - samples/sec: 1327.08 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 10:50:10,681 epoch 7 - iter 1560/2606 - loss 0.02982746 - time (sec): 165.68 - samples/sec: 1326.21 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-17 10:50:37,937 epoch 7 - iter 1820/2606 - loss 0.03066087 - time (sec): 192.94 - samples/sec: 1325.75 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 10:51:06,450 epoch 7 - iter 2080/2606 - loss 0.03008073 - time (sec): 221.45 - samples/sec: 1332.50 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-17 10:51:34,169 epoch 7 - iter 2340/2606 - loss 0.03142644 - time (sec): 249.17 - samples/sec: 1331.26 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 10:52:00,747 epoch 7 - iter 2600/2606 - loss 0.03141193 - time (sec): 275.75 - samples/sec: 1330.53 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-17 10:52:01,309 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:52:01,309 EPOCH 7 done: loss 0.0314 - lr: 0.000017
+ 2023-10-17 10:52:12,353 DEV : loss 0.363223135471344 - f1-score (micro avg) 0.4096
+ 2023-10-17 10:52:12,417 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:52:39,557 epoch 8 - iter 260/2606 - loss 0.01939361 - time (sec): 27.14 - samples/sec: 1321.47 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 10:53:07,789 epoch 8 - iter 520/2606 - loss 0.01739651 - time (sec): 55.37 - samples/sec: 1290.53 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-17 10:53:36,222 epoch 8 - iter 780/2606 - loss 0.01873777 - time (sec): 83.80 - samples/sec: 1267.00 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-17 10:54:05,116 epoch 8 - iter 1040/2606 - loss 0.01864960 - time (sec): 112.70 - samples/sec: 1263.19 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 10:54:32,882 epoch 8 - iter 1300/2606 - loss 0.01961418 - time (sec): 140.46 - samples/sec: 1267.02 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-17 10:55:00,732 epoch 8 - iter 1560/2606 - loss 0.01972925 - time (sec): 168.31 - samples/sec: 1280.96 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 10:55:28,259 epoch 8 - iter 1820/2606 - loss 0.01993508 - time (sec): 195.84 - samples/sec: 1291.10 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-17 10:55:55,992 epoch 8 - iter 2080/2606 - loss 0.02035476 - time (sec): 223.57 - samples/sec: 1305.07 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 10:56:23,834 epoch 8 - iter 2340/2606 - loss 0.02030965 - time (sec): 251.41 - samples/sec: 1312.25 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-17 10:56:52,233 epoch 8 - iter 2600/2606 - loss 0.02034546 - time (sec): 279.81 - samples/sec: 1310.76 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 10:56:52,873 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:56:52,874 EPOCH 8 done: loss 0.0203 - lr: 0.000011
+ 2023-10-17 10:57:05,587 DEV : loss 0.393916517496109 - f1-score (micro avg) 0.406
+ 2023-10-17 10:57:05,657 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 10:57:32,201 epoch 9 - iter 260/2606 - loss 0.00999242 - time (sec): 26.54 - samples/sec: 1281.27 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-17 10:57:59,545 epoch 9 - iter 520/2606 - loss 0.01555837 - time (sec): 53.89 - samples/sec: 1322.90 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-17 10:58:26,352 epoch 9 - iter 780/2606 - loss 0.01445469 - time (sec): 80.69 - samples/sec: 1300.78 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 10:58:54,326 epoch 9 - iter 1040/2606 - loss 0.01476011 - time (sec): 108.67 - samples/sec: 1284.97 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-17 10:59:21,870 epoch 9 - iter 1300/2606 - loss 0.01578290 - time (sec): 136.21 - samples/sec: 1293.88 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 10:59:50,972 epoch 9 - iter 1560/2606 - loss 0.01579451 - time (sec): 165.31 - samples/sec: 1287.08 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-17 11:00:20,149 epoch 9 - iter 1820/2606 - loss 0.01527993 - time (sec): 194.49 - samples/sec: 1286.46 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 11:00:49,010 epoch 9 - iter 2080/2606 - loss 0.01478612 - time (sec): 223.35 - samples/sec: 1291.50 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-17 11:01:17,393 epoch 9 - iter 2340/2606 - loss 0.01475537 - time (sec): 251.73 - samples/sec: 1297.41 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 11:01:46,450 epoch 9 - iter 2600/2606 - loss 0.01458389 - time (sec): 280.79 - samples/sec: 1305.91 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-17 11:01:47,114 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 11:01:47,114 EPOCH 9 done: loss 0.0147 - lr: 0.000006
+ 2023-10-17 11:02:00,212 DEV : loss 0.5024428367614746 - f1-score (micro avg) 0.3891
+ 2023-10-17 11:02:00,276 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 11:02:28,755 epoch 10 - iter 260/2606 - loss 0.00777663 - time (sec): 28.48 - samples/sec: 1304.69 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-17 11:02:56,881 epoch 10 - iter 520/2606 - loss 0.00852951 - time (sec): 56.60 - samples/sec: 1283.24 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 11:03:24,114 epoch 10 - iter 780/2606 - loss 0.00914261 - time (sec): 83.84 - samples/sec: 1274.15 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-17 11:03:50,849 epoch 10 - iter 1040/2606 - loss 0.00961380 - time (sec): 110.57 - samples/sec: 1302.75 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 11:04:20,070 epoch 10 - iter 1300/2606 - loss 0.00995380 - time (sec): 139.79 - samples/sec: 1298.95 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-17 11:04:47,597 epoch 10 - iter 1560/2606 - loss 0.00946199 - time (sec): 167.32 - samples/sec: 1295.54 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 11:05:14,358 epoch 10 - iter 1820/2606 - loss 0.00930366 - time (sec): 194.08 - samples/sec: 1303.79 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-17 11:05:41,338 epoch 10 - iter 2080/2606 - loss 0.00918369 - time (sec): 221.06 - samples/sec: 1310.36 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 11:06:09,838 epoch 10 - iter 2340/2606 - loss 0.00912808 - time (sec): 249.56 - samples/sec: 1315.21 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-17 11:06:39,144 epoch 10 - iter 2600/2606 - loss 0.00928698 - time (sec): 278.87 - samples/sec: 1315.25 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-17 11:06:39,685 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 11:06:39,685 EPOCH 10 done: loss 0.0093 - lr: 0.000000
+ 2023-10-17 11:06:51,588 DEV : loss 0.5263164043426514 - f1-score (micro avg) 0.3942
+ 2023-10-17 11:06:52,178 ----------------------------------------------------------------------------------------------------
+ 2023-10-17 11:06:52,180 Loading model from best epoch ...
+ 2023-10-17 11:06:54,538 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
+ 2023-10-17 11:07:15,105
+ Results:
+ - F-score (micro) 0.4845
+ - F-score (macro) 0.3222
+ - Accuracy 0.3241
+
+ By class:
+                precision     recall   f1-score    support
+
+          LOC      0.5629     0.5972     0.5795       1214
+          PER      0.4140     0.4406     0.4269        808
+          ORG      0.3077     0.2606     0.2822        353
+    HumanProd      0.0000     0.0000     0.0000         15
+
+    micro avg      0.4784     0.4908     0.4845       2390
+    macro avg      0.3211     0.3246     0.3222       2390
+ weighted avg      0.4713     0.4908     0.4804       2390
+
+ 2023-10-17 11:07:15,105 ----------------------------------------------------------------------------------------------------
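The three summary rows of the final test report follow from the per-class rows by simple arithmetic. As a rough sketch (the values are copied from the report and rounded to four decimals, so agreement is only expected to about 5e-4), the aggregates can be recomputed as follows; the variable names are illustrative only.

```python
# (precision, recall, f1, support) per class, copied from the test report.
BY_CLASS = {
    "LOC": (0.5629, 0.5972, 0.5795, 1214),
    "PER": (0.4140, 0.4406, 0.4269, 808),
    "ORG": (0.3077, 0.2606, 0.2822, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(s for _, _, _, s in BY_CLASS.values())

# Macro average: unweighted mean over classes, so the 15-span HumanProd
# class with F1 = 0 counts as much as LOC with 1214 spans.
macro_f1 = sum(f for _, _, f, _ in BY_CLASS.values()) / len(BY_CLASS)

# Weighted average: per-class F1 weighted by support.
weighted_f1 = sum(f * s for _, _, f, s in BY_CLASS.values()) / total_support

# Micro average: F1 of the pooled precision and recall reported in the log.
micro_f1 = f1(0.4784, 0.4908)

print(f"support={total_support} micro={micro_f1:.4f} "
      f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

The gap between micro (0.4845) and macro (0.3222) F1 comes largely from HumanProd: it has only 15 test spans and none are predicted correctly, which pulls the unweighted mean down while barely affecting the pooled score.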