2021-12-31 08:35:07,676 ----------------------------------------------------------------------------------------------------
2021-12-31 08:35:07,680 Model: "SequenceTagger(
  (embeddings): StackedEmbeddings(
    (list_embedding_0): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.5, inplace=False)
        (encoder): Embedding(275, 100)
        (rnn): LSTM(100, 1024)
        (decoder): Linear(in_features=1024, out_features=275, bias=True)
      )
    )
    (list_embedding_1): FlairEmbeddings(
      (lm): LanguageModel(
        (drop): Dropout(p=0.5, inplace=False)
        (encoder): Embedding(275, 100)
        (rnn): LSTM(100, 1024)
        (decoder): Linear(in_features=1024, out_features=275, bias=True)
      )
    )
    (list_embedding_2): TransformerWordEmbeddings(
      (model): CamembertModel(
        (embeddings): RobertaEmbeddings(
          (word_embeddings): Embedding(32005, 768, padding_idx=1)
          (position_embeddings): Embedding(514, 768, padding_idx=1)
          (token_type_embeddings): Embedding(1, 768)
          (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
          (dropout): Dropout(p=0.1, inplace=False)
        )
        (encoder): RobertaEncoder(
          (layer): ModuleList(
            (0): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (1): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (2): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (3): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (4): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (5): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (6): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (7): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (8): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (9): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (10): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (11): RobertaLayer(
              (attention): RobertaAttention(
                (self): RobertaSelfAttention(
                  (query): Linear(in_features=768, out_features=768, bias=True)
                  (key): Linear(in_features=768, out_features=768, bias=True)
                  (value): Linear(in_features=768, out_features=768, bias=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
                (output): RobertaSelfOutput(
                  (dense): Linear(in_features=768, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
              (intermediate): RobertaIntermediate(
                (dense): Linear(in_features=768, out_features=3072, bias=True)
              )
              (output): RobertaOutput(
                (dense): Linear(in_features=3072, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (pooler): RobertaPooler(
          (dense): Linear(in_features=768, out_features=768, bias=True)
          (activation): Tanh()
        )
      )
    )
  )
  (word_dropout): WordDropout(p=0.05)
  (locked_dropout): LockedDropout(p=0.5)
  (embedding2nn): Linear(in_features=2816, out_features=2816, bias=True)
  (rnn): LSTM(2816, 256, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=512, out_features=68, bias=True)
  (beta): 1.0
  (weights): None
  (weight_tensor) None
)"
2021-12-31 08:35:07,680 ----------------------------------------------------------------------------------------------------
2021-12-31 08:35:07,681 Corpus: "Corpus: 14449 train + 1476 dev + 416 test sentences"
2021-12-31 08:35:07,681 ----------------------------------------------------------------------------------------------------
2021-12-31 08:35:07,681 Parameters:
2021-12-31 08:35:07,681  - learning_rate: "0.1"
2021-12-31 08:35:07,681  - mini_batch_size: "8"
2021-12-31 08:35:07,681  - patience: "3"
2021-12-31 08:35:07,681  - anneal_factor: "0.5"
2021-12-31 08:35:07,681  - max_epochs: "50"
2021-12-31 08:35:07,681  - shuffle: "True"
2021-12-31 08:35:07,681  - train_with_dev: "False"
2021-12-31 08:35:07,681  - batch_growth_annealing: "False"
2021-12-31 08:35:07,681 ----------------------------------------------------------------------------------------------------
2021-12-31 08:35:07,681 Model training base path: "models/UPOS_UD_FRENCH_GSD_PLUS_Flair-Embeddings_50_2021-12-31-08:34:44"
2021-12-31 08:35:07,681 ----------------------------------------------------------------------------------------------------
2021-12-31 08:35:07,682 Device: cuda:0
2021-12-31 08:35:07,682 ----------------------------------------------------------------------------------------------------
2021-12-31 08:35:07,682 Embeddings storage mode: cpu
2021-12-31 08:35:07,686 ----------------------------------------------------------------------------------------------------
2021-12-31 08:35:35,600 epoch 1 - iter 180/1807 - loss 1.43338722 - samples/sec: 51.63 - lr: 0.100000
2021-12-31 08:36:03,642 epoch 1 - iter 360/1807 - loss 0.97278560 - samples/sec: 51.39 - lr: 0.100000
2021-12-31 08:36:31,448 epoch 1 - iter 540/1807 - loss 0.77628898 - samples/sec: 51.83 - lr: 0.100000
2021-12-31 08:37:00,007 epoch 1 - iter 720/1807 - loss 0.66122431 - samples/sec: 50.46 - lr: 0.100000
2021-12-31 08:37:29,449 epoch 1 - iter 900/1807 - loss 0.58637716 - samples/sec: 48.94 - lr: 0.100000
2021-12-31 08:37:57,842 epoch 1 - iter 1080/1807 - loss 0.53261867 - samples/sec: 50.75 - lr: 0.100000
2021-12-31 08:38:27,836 epoch 1 - iter 1260/1807 - loss 0.49236809 - samples/sec: 48.04 - lr: 0.100000
2021-12-31 08:38:56,177 epoch 1 - iter 1440/1807 - loss 0.46224064 - samples/sec: 50.84 - lr: 0.100000
2021-12-31 08:39:25,301 epoch 1 - iter 1620/1807 - loss 0.43700232 - samples/sec: 49.48 - lr: 0.100000
2021-12-31 08:39:53,843 epoch 1 - iter 1800/1807 - loss 0.41459922 - samples/sec: 50.49 - lr: 0.100000
2021-12-31 08:39:54,850 ----------------------------------------------------------------------------------------------------
2021-12-31 08:39:54,851 EPOCH 1 done: loss 0.4139 - lr 0.1000000
2021-12-31 08:40:38,186 DEV : loss 0.09867297857999802 - f1-score (micro avg)  0.9723
2021-12-31 08:40:38,373 BAD EPOCHS (no improvement): 0
2021-12-31 08:40:38,375 saving best model
2021-12-31 08:40:43,945 ----------------------------------------------------------------------------------------------------
2021-12-31 08:40:59,809 epoch 2 - iter 180/1807 - loss 0.20282785 - samples/sec: 90.92 - lr: 0.100000
2021-12-31 08:41:15,798 epoch 2 - iter 360/1807 - loss 0.20600484 - samples/sec: 90.20 - lr: 0.100000
2021-12-31 08:41:31,824 epoch 2 - iter 540/1807 - loss 0.20352355 - samples/sec: 89.99 - lr: 0.100000
2021-12-31 08:41:47,291 epoch 2 - iter 720/1807 - loss 0.19945298 - samples/sec: 93.24 - lr: 0.100000
2021-12-31 08:42:03,389 epoch 2 - iter 900/1807 - loss 0.19672769 - samples/sec: 89.58 - lr: 0.100000
2021-12-31 08:42:19,546 epoch 2 - iter 1080/1807 - loss 0.19404584 - samples/sec: 89.25 - lr: 0.100000
2021-12-31 08:42:35,186 epoch 2 - iter 1260/1807 - loss 0.19211776 - samples/sec: 92.22 - lr: 0.100000
2021-12-31 08:42:51,014 epoch 2 - iter 1440/1807 - loss 0.19040930 - samples/sec: 91.11 - lr: 0.100000
2021-12-31 08:43:07,108 epoch 2 - iter 1620/1807 - loss 0.18835936 - samples/sec: 89.60 - lr: 0.100000
2021-12-31 08:43:22,664 epoch 2 - iter 1800/1807 - loss 0.18684498 - samples/sec: 92.71 - lr: 0.100000
2021-12-31 08:43:23,166 ----------------------------------------------------------------------------------------------------
2021-12-31 08:43:23,166 EPOCH 2 done: loss 0.1868 - lr 0.1000000
2021-12-31 08:43:59,411 DEV : loss 0.08219591528177261 - f1-score (micro avg)  0.9761
2021-12-31 08:43:59,601 BAD EPOCHS (no improvement): 0
2021-12-31 08:43:59,602 saving best model
2021-12-31 08:44:04,994 ----------------------------------------------------------------------------------------------------
2021-12-31 08:44:21,188 epoch 3 - iter 180/1807 - loss 0.16248988 - samples/sec: 89.06 - lr: 0.100000
2021-12-31 08:44:37,143 epoch 3 - iter 360/1807 - loss 0.16012805 - samples/sec: 90.38 - lr: 0.100000
2021-12-31 08:44:53,240 epoch 3 - iter 540/1807 - loss 0.15771573 - samples/sec: 89.59 - lr: 0.100000
2021-12-31 08:45:08,820 epoch 3 - iter 720/1807 - loss 0.15678918 - samples/sec: 92.57 - lr: 0.100000
2021-12-31 08:45:24,447 epoch 3 - iter 900/1807 - loss 0.15583330 - samples/sec: 92.28 - lr: 0.100000
2021-12-31 08:45:40,453 epoch 3 - iter 1080/1807 - loss 0.15551694 - samples/sec: 90.10 - lr: 0.100000
2021-12-31 08:45:56,421 epoch 3 - iter 1260/1807 - loss 0.15503272 - samples/sec: 90.32 - lr: 0.100000
2021-12-31 08:46:12,207 epoch 3 - iter 1440/1807 - loss 0.15478837 - samples/sec: 91.35 - lr: 0.100000
2021-12-31 08:46:28,067 epoch 3 - iter 1620/1807 - loss 0.15437671 - samples/sec: 90.93 - lr: 0.100000
2021-12-31 08:46:44,096 epoch 3 - iter 1800/1807 - loss 0.15334210 - samples/sec: 89.96 - lr: 0.100000
2021-12-31 08:46:44,638 ----------------------------------------------------------------------------------------------------
2021-12-31 08:46:44,638 EPOCH 3 done: loss 0.1533 - lr 0.1000000
2021-12-31 08:47:19,364 DEV : loss 0.07821641117334366 - f1-score (micro avg)  0.9771
2021-12-31 08:47:19,574 BAD EPOCHS (no improvement): 0
2021-12-31 08:47:19,576 saving best model
2021-12-31 08:47:25,807 ----------------------------------------------------------------------------------------------------
2021-12-31 08:47:42,295 epoch 4 - iter 180/1807 - loss 0.14078583 - samples/sec: 87.48 - lr: 0.100000
2021-12-31 08:47:58,394 epoch 4 - iter 360/1807 - loss 0.14084079 - samples/sec: 89.58 - lr: 0.100000
2021-12-31 08:48:14,377 epoch 4 - iter 540/1807 - loss 0.13969043 - samples/sec: 90.22 - lr: 0.100000
2021-12-31 08:48:30,411 epoch 4 - iter 720/1807 - loss 0.13901425 - samples/sec: 89.95 - lr: 0.100000
2021-12-31 08:48:45,985 epoch 4 - iter 900/1807 - loss 0.13965987 - samples/sec: 92.60 - lr: 0.100000
2021-12-31 08:49:01,706 epoch 4 - iter 1080/1807 - loss 0.13942263 - samples/sec: 91.73 - lr: 0.100000
2021-12-31 08:49:17,833 epoch 4 - iter 1260/1807 - loss 0.13931213 - samples/sec: 89.42 - lr: 0.100000
2021-12-31 08:49:33,693 epoch 4 - iter 1440/1807 - loss 0.13835426 - samples/sec: 90.94 - lr: 0.100000
2021-12-31 08:49:49,444 epoch 4 - iter 1620/1807 - loss 0.13722078 - samples/sec: 91.56 - lr: 0.100000
2021-12-31 08:50:05,233 epoch 4 - iter 1800/1807 - loss 0.13680325 - samples/sec: 91.33 - lr: 0.100000
2021-12-31 08:50:05,825 ----------------------------------------------------------------------------------------------------
2021-12-31 08:50:05,826 EPOCH 4 done: loss 0.1368 - lr 0.1000000
2021-12-31 08:50:40,951 DEV : loss 0.07048774510622025 - f1-score (micro avg)  0.9784
2021-12-31 08:50:41,121 BAD EPOCHS (no improvement): 0
2021-12-31 08:50:41,123 saving best model
2021-12-31 08:50:46,985 ----------------------------------------------------------------------------------------------------
2021-12-31 08:51:03,480 epoch 5 - iter 180/1807 - loss 0.12576483 - samples/sec: 87.44 - lr: 0.100000
2021-12-31 08:51:19,312 epoch 5 - iter 360/1807 - loss 0.12838224 - samples/sec: 91.10 - lr: 0.100000
2021-12-31 08:51:35,140 epoch 5 - iter 540/1807 - loss 0.13027925 - samples/sec: 91.11 - lr: 0.100000
2021-12-31 08:51:51,382 epoch 5 - iter 720/1807 - loss 0.13001079 - samples/sec: 88.78 - lr: 0.100000
2021-12-31 08:52:07,009 epoch 5 - iter 900/1807 - loss 0.12990639 - samples/sec: 92.28 - lr: 0.100000
2021-12-31 08:52:22,749 epoch 5 - iter 1080/1807 - loss 0.12927608 - samples/sec: 91.63 - lr: 0.100000
2021-12-31 08:52:38,459 epoch 5 - iter 1260/1807 - loss 0.12839810 - samples/sec: 91.79 - lr: 0.100000
2021-12-31 08:52:54,183 epoch 5 - iter 1440/1807 - loss 0.12750076 - samples/sec: 91.71 - lr: 0.100000
2021-12-31 08:53:09,782 epoch 5 - iter 1620/1807 - loss 0.12744081 - samples/sec: 92.45 - lr: 0.100000
2021-12-31 08:53:26,181 epoch 5 - iter 1800/1807 - loss 0.12697954 - samples/sec: 87.94 - lr: 0.100000
2021-12-31 08:53:26,718 ----------------------------------------------------------------------------------------------------
2021-12-31 08:53:26,718 EPOCH 5 done: loss 0.1270 - lr 0.1000000
2021-12-31 08:54:05,303 DEV : loss 0.06857253611087799 - f1-score (micro avg)  0.9795
2021-12-31 08:54:05,490 BAD EPOCHS (no improvement): 0
2021-12-31 08:54:05,491 saving best model
2021-12-31 08:54:11,317 ----------------------------------------------------------------------------------------------------
2021-12-31 08:54:27,729 epoch 6 - iter 180/1807 - loss 0.12012197 - samples/sec: 87.88 - lr: 0.100000
2021-12-31 08:54:43,570 epoch 6 - iter 360/1807 - loss 0.12134345 - samples/sec: 91.04 - lr: 0.100000
2021-12-31 08:54:59,298 epoch 6 - iter 540/1807 - loss 0.12010472 - samples/sec: 91.70 - lr: 0.100000
2021-12-31 08:55:14,710 epoch 6 - iter 720/1807 - loss 0.11985671 - samples/sec: 93.58 - lr: 0.100000
2021-12-31 08:55:30,873 epoch 6 - iter 900/1807 - loss 0.12032070 - samples/sec: 89.22 - lr: 0.100000
2021-12-31 08:55:46,705 epoch 6 - iter 1080/1807 - loss 0.11976455 - samples/sec: 91.08 - lr: 0.100000
2021-12-31 08:56:02,915 epoch 6 - iter 1260/1807 - loss 0.11964832 - samples/sec: 88.97 - lr: 0.100000
2021-12-31 08:56:18,616 epoch 6 - iter 1440/1807 - loss 0.11958148 - samples/sec: 91.86 - lr: 0.100000
2021-12-31 08:56:34,478 epoch 6 - iter 1620/1807 - loss 0.12003314 - samples/sec: 90.91 - lr: 0.100000
2021-12-31 08:56:50,548 epoch 6 - iter 1800/1807 - loss 0.11950787 - samples/sec: 89.75 - lr: 0.100000
2021-12-31 08:56:51,070 ----------------------------------------------------------------------------------------------------
2021-12-31 08:56:51,070 EPOCH 6 done: loss 0.1195 - lr 0.1000000
2021-12-31 08:57:26,881 DEV : loss 0.06588418781757355 - f1-score (micro avg)  0.9805
2021-12-31 08:57:27,077 BAD EPOCHS (no improvement): 0
2021-12-31 08:57:27,079 saving best model
2021-12-31 08:57:32,878 ----------------------------------------------------------------------------------------------------
2021-12-31 08:57:49,222 epoch 7 - iter 180/1807 - loss 0.11622596 - samples/sec: 88.27 - lr: 0.100000
2021-12-31 08:58:05,154 epoch 7 - iter 360/1807 - loss 0.11182908 - samples/sec: 90.52 - lr: 0.100000
2021-12-31 08:58:21,316 epoch 7 - iter 540/1807 - loss 0.11325284 - samples/sec: 89.23 - lr: 0.100000
2021-12-31 08:58:37,501 epoch 7 - iter 720/1807 - loss 0.11356510 - samples/sec: 89.11 - lr: 0.100000
2021-12-31 08:58:53,437 epoch 7 - iter 900/1807 - loss 0.11375009 - samples/sec: 90.50 - lr: 0.100000
2021-12-31 08:59:09,683 epoch 7 - iter 1080/1807 - loss 0.11424006 - samples/sec: 88.76 - lr: 0.100000
2021-12-31 08:59:25,513 epoch 7 - iter 1260/1807 - loss 0.11502991 - samples/sec: 91.10 - lr: 0.100000
2021-12-31 08:59:41,355 epoch 7 - iter 1440/1807 - loss 0.11465724 - samples/sec: 91.04 - lr: 0.100000
2021-12-31 08:59:57,048 epoch 7 - iter 1620/1807 - loss 0.11489345 - samples/sec: 91.91 - lr: 0.100000
2021-12-31 09:00:13,626 epoch 7 - iter 1800/1807 - loss 0.11495780 - samples/sec: 86.99 - lr: 0.100000
2021-12-31 09:00:14,225 ----------------------------------------------------------------------------------------------------
2021-12-31 09:00:14,225 EPOCH 7 done: loss 0.1149 - lr 0.1000000
2021-12-31 09:00:50,356 DEV : loss 0.06450950354337692 - f1-score (micro avg)  0.981
2021-12-31 09:00:50,566 BAD EPOCHS (no improvement): 0
2021-12-31 09:00:50,572 saving best model
2021-12-31 09:00:56,353 ----------------------------------------------------------------------------------------------------
2021-12-31 09:01:12,703 epoch 8 - iter 180/1807 - loss 0.10372694 - samples/sec: 88.23 - lr: 0.100000
2021-12-31 09:01:28,785 epoch 8 - iter 360/1807 - loss 0.10507104 - samples/sec: 89.68 - lr: 0.100000
2021-12-31 09:01:45,134 epoch 8 - iter 540/1807 - loss 0.10666062 - samples/sec: 88.21 - lr: 0.100000
2021-12-31 09:02:01,507 epoch 8 - iter 720/1807 - loss 0.10750728 - samples/sec: 88.08 - lr: 0.100000
2021-12-31 09:02:17,626 epoch 8 - iter 900/1807 - loss 0.10760637 - samples/sec: 89.47 - lr: 0.100000
2021-12-31 09:02:33,374 epoch 8 - iter 1080/1807 - loss 0.10788257 - samples/sec: 91.58 - lr: 0.100000
2021-12-31 09:02:49,200 epoch 8 - iter 1260/1807 - loss 0.10808589 - samples/sec: 91.12 - lr: 0.100000
2021-12-31 09:03:05,738 epoch 8 - iter 1440/1807 - loss 0.10815170 - samples/sec: 87.20 - lr: 0.100000
2021-12-31 09:03:21,442 epoch 8 - iter 1620/1807 - loss 0.10840840 - samples/sec: 91.84 - lr: 0.100000
2021-12-31 09:03:37,709 epoch 8 - iter 1800/1807 - loss 0.10855634 - samples/sec: 88.66 - lr: 0.100000
2021-12-31 09:03:38,280 ----------------------------------------------------------------------------------------------------
2021-12-31 09:03:38,280 EPOCH 8 done: loss 0.1086 - lr 0.1000000
2021-12-31 09:04:17,043 DEV : loss 0.06390747427940369 - f1-score (micro avg)  0.9805
2021-12-31 09:04:17,194 BAD EPOCHS (no improvement): 1
2021-12-31 09:04:17,196 ----------------------------------------------------------------------------------------------------
2021-12-31 09:04:33,331 epoch 9 - iter 180/1807 - loss 0.10260778 - samples/sec: 89.39 - lr: 0.100000
2021-12-31 09:04:49,336 epoch 9 - iter 360/1807 - loss 0.10566575 - samples/sec: 90.11 - lr: 0.100000
2021-12-31 09:05:05,083 epoch 9 - iter 540/1807 - loss 0.10556216 - samples/sec: 91.59 - lr: 0.100000
2021-12-31 09:05:21,004 epoch 9 - iter 720/1807 - loss 0.10506801 - samples/sec: 90.58 - lr: 0.100000
2021-12-31 09:05:37,109 epoch 9 - iter 900/1807 - loss 0.10596338 - samples/sec: 89.54 - lr: 0.100000
2021-12-31 09:05:52,784 epoch 9 - iter 1080/1807 - loss 0.10577668 - samples/sec: 92.02 - lr: 0.100000
2021-12-31 09:06:08,937 epoch 9 - iter 1260/1807 - loss 0.10613509 - samples/sec: 89.28 - lr: 0.100000
2021-12-31 09:06:24,601 epoch 9 - iter 1440/1807 - loss 0.10637150 - samples/sec: 92.06 - lr: 0.100000
2021-12-31 09:06:40,409 epoch 9 - iter 1620/1807 - loss 0.10629708 - samples/sec: 91.23 - lr: 0.100000
2021-12-31 09:06:55,972 epoch 9 - iter 1800/1807 - loss 0.10610710 - samples/sec: 92.67 - lr: 0.100000
2021-12-31 09:06:56,557 ----------------------------------------------------------------------------------------------------
2021-12-31 09:06:56,557 EPOCH 9 done: loss 0.1061 - lr 0.1000000
2021-12-31 09:07:32,784 DEV : loss 0.06607701629400253 - f1-score (micro avg)  0.9814
2021-12-31 09:07:32,970 BAD EPOCHS (no improvement): 0
2021-12-31 09:07:32,972 saving best model
2021-12-31 09:07:38,755 ----------------------------------------------------------------------------------------------------
2021-12-31 09:07:55,004 epoch 10 - iter 180/1807 - loss 0.10366226 - samples/sec: 88.76 - lr: 0.100000
2021-12-31 09:08:11,104 epoch 10 - iter 360/1807 - loss 0.10828055 - samples/sec: 89.58 - lr: 0.100000
2021-12-31 09:08:26,748 epoch 10 - iter 540/1807 - loss 0.10589800 - samples/sec: 92.20 - lr: 0.100000
2021-12-31 09:08:42,772 epoch 10 - iter 720/1807 - loss 0.10467961 - samples/sec: 90.00 - lr: 0.100000
2021-12-31 09:08:58,992 epoch 10 - iter 900/1807 - loss 0.10355149 - samples/sec: 88.91 - lr: 0.100000
2021-12-31 09:09:14,753 epoch 10 - iter 1080/1807 - loss 0.10313717 - samples/sec: 91.50 - lr: 0.100000
2021-12-31 09:09:30,631 epoch 10 - iter 1260/1807 - loss 0.10353533 - samples/sec: 90.84 - lr: 0.100000
2021-12-31 09:09:46,654 epoch 10 - iter 1440/1807 - loss 0.10386166 - samples/sec: 90.02 - lr: 0.100000
2021-12-31 09:10:02,791 epoch 10 - iter 1620/1807 - loss 0.10346798 - samples/sec: 89.36 - lr: 0.100000
2021-12-31 09:10:18,970 epoch 10 - iter 1800/1807 - loss 0.10358051 - samples/sec: 89.14 - lr: 0.100000
2021-12-31 09:10:19,492 ----------------------------------------------------------------------------------------------------
2021-12-31 09:10:19,492 EPOCH 10 done: loss 0.1036 - lr 0.1000000
2021-12-31 09:10:55,557 DEV : loss 0.06536506861448288 - f1-score (micro avg)  0.9811
2021-12-31 09:10:55,753 BAD EPOCHS (no improvement): 1
2021-12-31 09:10:55,756 ----------------------------------------------------------------------------------------------------
2021-12-31 09:11:12,024 epoch 11 - iter 180/1807 - loss 0.10182872 - samples/sec: 88.66 - lr: 0.100000
2021-12-31 09:11:28,246 epoch 11 - iter 360/1807 - loss 0.10175535 - samples/sec: 88.90 - lr: 0.100000
2021-12-31 09:11:43,844 epoch 11 - iter 540/1807 - loss 0.10107946 - samples/sec: 92.46 - lr: 0.100000
2021-12-31 09:11:59,559 epoch 11 - iter 720/1807 - loss 0.10053922 - samples/sec: 91.77 - lr: 0.100000
2021-12-31 09:12:15,490 epoch 11 - iter 900/1807 - loss 0.10047028 - samples/sec: 90.54 - lr: 0.100000
2021-12-31 09:12:31,195 epoch 11 - iter 1080/1807 - loss 0.09993958 - samples/sec: 91.82 - lr: 0.100000
2021-12-31 09:12:47,013 epoch 11 - iter 1260/1807 - loss 0.09996914 - samples/sec: 91.17 - lr: 0.100000
2021-12-31 09:13:03,156 epoch 11 - iter 1440/1807 - loss 0.09980985 - samples/sec: 89.35 - lr: 0.100000
2021-12-31 09:13:18,852 epoch 11 - iter 1620/1807 - loss 0.09941318 - samples/sec: 91.88 - lr: 0.100000
2021-12-31 09:13:35,014 epoch 11 - iter 1800/1807 - loss 0.09934768 - samples/sec: 89.23 - lr: 0.100000
2021-12-31 09:13:35,650 ----------------------------------------------------------------------------------------------------
2021-12-31 09:13:35,650 EPOCH 11 done: loss 0.0993 - lr 0.1000000
2021-12-31 09:14:14,419 DEV : loss 0.06659943610429764 - f1-score (micro avg)  0.9811
2021-12-31 09:14:14,622 BAD EPOCHS (no improvement): 2
2021-12-31 09:14:14,623 ----------------------------------------------------------------------------------------------------
2021-12-31 09:14:30,892 epoch 12 - iter 180/1807 - loss 0.09334718 - samples/sec: 88.66 - lr: 0.100000
2021-12-31 09:14:46,737 epoch 12 - iter 360/1807 - loss 0.09477923 - samples/sec: 91.02 - lr: 0.100000
2021-12-31 09:15:02,926 epoch 12 - iter 540/1807 - loss 0.09677398 - samples/sec: 89.09 - lr: 0.100000
2021-12-31 09:15:19,177 epoch 12 - iter 720/1807 - loss 0.09825518 - samples/sec: 88.74 - lr: 0.100000
2021-12-31 09:15:34,958 epoch 12 - iter 900/1807 - loss 0.09910665 - samples/sec: 91.38 - lr: 0.100000
2021-12-31 09:15:51,056 epoch 12 - iter 1080/1807 - loss 0.09820501 - samples/sec: 89.59 - lr: 0.100000
2021-12-31 09:16:07,231 epoch 12 - iter 1260/1807 - loss 0.09858638 - samples/sec: 89.16 - lr: 0.100000
2021-12-31 09:16:22,988 epoch 12 - iter 1440/1807 - loss 0.09845736 - samples/sec: 91.52 - lr: 0.100000
2021-12-31 09:16:38,631 epoch 12 - iter 1620/1807 - loss 0.09859390 - samples/sec: 92.21 - lr: 0.100000
2021-12-31 09:16:54,209 epoch 12 - iter 1800/1807 - loss 0.09847298 - samples/sec: 92.58 - lr: 0.100000
2021-12-31 09:16:54,729 ----------------------------------------------------------------------------------------------------
2021-12-31 09:16:54,730 EPOCH 12 done: loss 0.0984 - lr 0.1000000
2021-12-31 09:17:31,308 DEV : loss 0.06410104781389236 - f1-score (micro avg)  0.9816
2021-12-31 09:17:31,487 BAD EPOCHS (no improvement): 0
2021-12-31 09:17:31,489 saving best model
2021-12-31 09:17:37,260 ----------------------------------------------------------------------------------------------------
2021-12-31 09:17:54,060 epoch 13 - iter 180/1807 - loss 0.10013605 - samples/sec: 85.86 - lr: 0.100000
2021-12-31 09:18:09,827 epoch 13 - iter 360/1807 - loss 0.09881566 - samples/sec: 91.47 - lr: 0.100000
2021-12-31 09:18:25,218 epoch 13 - iter 540/1807 - loss 0.09860664 - samples/sec: 93.71 - lr: 0.100000
2021-12-31 09:18:41,246 epoch 13 - iter 720/1807 - loss 0.09768065 - samples/sec: 89.97 - lr: 0.100000
2021-12-31 09:18:57,306 epoch 13 - iter 900/1807 - loss 0.09766501 - samples/sec: 89.79 - lr: 0.100000
2021-12-31 09:19:12,914 epoch 13 - iter 1080/1807 - loss 0.09767968 - samples/sec: 92.41 - lr: 0.100000
2021-12-31 09:19:29,144 epoch 13 - iter 1260/1807 - loss 0.09667902 - samples/sec: 88.86 - lr: 0.100000
2021-12-31 09:19:45,573 epoch 13 - iter 1440/1807 - loss 0.09670686 - samples/sec: 87.78 - lr: 0.100000
2021-12-31 09:20:01,566 epoch 13 - iter 1620/1807 - loss 0.09672936 - samples/sec: 90.18 - lr: 0.100000
2021-12-31 09:20:17,572 epoch 13 - iter 1800/1807 - loss 0.09666135 - samples/sec: 90.10 - lr: 0.100000
2021-12-31 09:20:18,200 ----------------------------------------------------------------------------------------------------
2021-12-31 09:20:18,200 EPOCH 13 done: loss 0.0967 - lr 0.1000000
2021-12-31 09:20:54,147 DEV : loss 0.06427688896656036 - f1-score (micro avg)  0.9816
2021-12-31 09:20:54,334 BAD EPOCHS (no improvement): 1
2021-12-31 09:20:54,335 ----------------------------------------------------------------------------------------------------
2021-12-31 09:21:10,174 epoch 14 - iter 180/1807 - loss 0.09391481 - samples/sec: 91.06 - lr: 0.100000
2021-12-31 09:21:26,400 epoch 14 - iter 360/1807 - loss 0.09267418 - samples/sec: 88.88 - lr: 0.100000
2021-12-31 09:21:42,313 epoch 14 - iter 540/1807 - loss 0.09273735 - samples/sec: 90.64 - lr: 0.100000
2021-12-31 09:21:58,477 epoch 14 - iter 720/1807 - loss 0.09237732 - samples/sec: 89.22 - lr: 0.100000
2021-12-31 09:22:14,088 epoch 14 - iter 900/1807 - loss 0.09290387 - samples/sec: 92.38 - lr: 0.100000
2021-12-31 09:22:29,793 epoch 14 - iter 1080/1807 - loss 0.09305725 - samples/sec: 91.82 - lr: 0.100000
2021-12-31 09:22:45,455 epoch 14 - iter 1260/1807 - loss 0.09321173 - samples/sec: 92.09 - lr: 0.100000
2021-12-31 09:23:01,412 epoch 14 - iter 1440/1807 - loss 0.09321459 - samples/sec: 90.38 - lr: 0.100000
2021-12-31 09:23:17,629 epoch 14 - iter 1620/1807 - loss 0.09332877 - samples/sec: 88.93 - lr: 0.100000
2021-12-31 09:23:33,527 epoch 14 - iter 1800/1807 - loss 0.09313892 - samples/sec: 90.71 - lr: 0.100000
2021-12-31 09:23:34,165 ----------------------------------------------------------------------------------------------------
2021-12-31 09:23:34,165 EPOCH 14 done: loss 0.0931 - lr 0.1000000
2021-12-31 09:24:12,840 DEV : loss 0.06639766693115234 - f1-score (micro avg)  0.9817
2021-12-31 09:24:13,034 BAD EPOCHS (no improvement): 0
2021-12-31 09:24:13,036 saving best model
2021-12-31 09:24:18,822 ----------------------------------------------------------------------------------------------------
2021-12-31 09:24:34,568 epoch 15 - iter 180/1807 - loss 0.09134784 - samples/sec: 91.60 - lr: 0.100000
2021-12-31 09:24:50,712 epoch 15 - iter 360/1807 - loss 0.09119751 - samples/sec: 89.33 - lr: 0.100000
2021-12-31 09:25:07,155 epoch 15 - iter 540/1807 - loss 0.08993505 - samples/sec: 87.70 - lr: 0.100000
2021-12-31 09:25:23,092 epoch 15 - iter 720/1807 - loss 0.09062331 - samples/sec: 90.50 - lr: 0.100000
2021-12-31 09:25:39,643 epoch 15 - iter 900/1807 - loss 0.09054947 - samples/sec: 87.13 - lr: 0.100000
2021-12-31 09:25:56,080 epoch 15 - iter 1080/1807 - loss 0.09120586 - samples/sec: 87.73 - lr: 0.100000
2021-12-31 09:26:12,023 epoch 15 - iter 1260/1807 - loss 0.09202164 - samples/sec: 90.49 - lr: 0.100000
2021-12-31 09:26:27,452 epoch 15 - iter 1440/1807 - loss 0.09257595 - samples/sec: 93.48 - lr: 0.100000
2021-12-31 09:26:43,293 epoch 15 - iter 1620/1807 - loss 0.09296868 - samples/sec: 91.04 - lr: 0.100000
2021-12-31 09:26:59,412 epoch 15 - iter 1800/1807 - loss 0.09272942 - samples/sec: 89.47 - lr: 0.100000
2021-12-31 09:26:59,991 ----------------------------------------------------------------------------------------------------
2021-12-31 09:26:59,991 EPOCH 15 done: loss 0.0927 - lr 0.1000000
2021-12-31 09:27:36,227 DEV : loss 0.06283392012119293 - f1-score (micro avg)  0.982
2021-12-31 09:27:36,433 BAD EPOCHS (no improvement): 0
2021-12-31 09:27:36,435 saving best model
2021-12-31 09:27:42,216 ----------------------------------------------------------------------------------------------------
2021-12-31 09:27:58,274 epoch 16 - iter 180/1807 - loss 0.08868552 - samples/sec: 89.83 - lr: 0.100000
2021-12-31 09:28:14,083 epoch 16 - iter 360/1807 - loss 0.08898795 - samples/sec: 91.23 - lr: 0.100000
2021-12-31 09:28:30,428 epoch 16 - iter 540/1807 - loss 0.08723848 - samples/sec: 88.23 - lr: 0.100000
2021-12-31 09:28:46,065 epoch 16 - iter 720/1807 - loss 0.08840922 - samples/sec: 92.21 - lr: 0.100000
2021-12-31 09:29:01,697 epoch 16 - iter 900/1807 - loss 0.08907246 - samples/sec: 92.26 - lr: 0.100000
2021-12-31 09:29:17,387 epoch 16 - iter 1080/1807 - loss 0.09016391 - samples/sec: 91.91 - lr: 0.100000
2021-12-31 09:29:33,637 epoch 16 - iter 1260/1807 - loss 0.09090909 - samples/sec: 88.74 - lr: 0.100000
2021-12-31 09:29:49,596 epoch 16 - iter 1440/1807 - loss 0.09079363 - samples/sec: 90.36 - lr: 0.100000
2021-12-31 09:30:05,085 epoch 16 - iter 1620/1807 - loss 0.09144623 - samples/sec: 93.12 - lr: 0.100000
2021-12-31 09:30:21,000 epoch 16 - iter 1800/1807 - loss 0.09062250 - samples/sec: 90.62 - lr: 0.100000
2021-12-31 09:30:21,608 ----------------------------------------------------------------------------------------------------
2021-12-31 09:30:21,608 EPOCH 16 done: loss 0.0906 - lr 0.1000000
2021-12-31 09:30:58,333 DEV : loss 0.06354553997516632 - f1-score (micro avg)  0.982
2021-12-31 09:30:58,512 BAD EPOCHS (no improvement): 1
2021-12-31 09:30:58,514 ----------------------------------------------------------------------------------------------------
2021-12-31 09:31:14,847 epoch 17 - iter 180/1807 - loss 0.08390522 - samples/sec: 88.30 - lr: 0.100000
2021-12-31 09:31:30,522 epoch 17 - iter 360/1807 - loss 0.08649584 - samples/sec: 92.01 - lr: 0.100000
2021-12-31 09:31:46,288 epoch 17 - iter 540/1807 - loss 0.08940335 - samples/sec: 91.48 - lr: 0.100000
2021-12-31 09:32:02,118 epoch 17 - iter 720/1807 - loss 0.09059873 - samples/sec: 91.09 - lr: 0.100000
2021-12-31 09:32:17,806 epoch 17 - iter 900/1807 - loss 0.09026440 - samples/sec: 91.93 - lr: 0.100000
2021-12-31 09:32:33,488 epoch 17 - iter 1080/1807 - loss 0.09038711 - samples/sec: 91.96 - lr: 0.100000
2021-12-31 09:32:49,442 epoch 17 - iter 1260/1807 - loss 0.08978670 - samples/sec: 90.39 - lr: 0.100000
2021-12-31 09:33:05,170 epoch 17 - iter 1440/1807 - loss 0.08929018 - samples/sec: 91.69 - lr: 0.100000
2021-12-31 09:33:21,122 epoch 17 - iter 1620/1807 - loss 0.08920206 - samples/sec: 90.40 - lr: 0.100000
2021-12-31 09:33:36,598 epoch 17 - iter 1800/1807 - loss 0.08958801 - samples/sec: 93.18 - lr: 0.100000
2021-12-31 09:33:37,149 ----------------------------------------------------------------------------------------------------
2021-12-31 09:33:37,149 EPOCH 17 done: loss 0.0895 - lr 0.1000000
2021-12-31 09:34:16,446 DEV : loss 0.06361010670661926 - f1-score (micro avg)  0.9823
2021-12-31 09:34:16,599 BAD EPOCHS (no improvement): 0
2021-12-31 09:34:16,600 saving best model
2021-12-31 09:34:22,434 ----------------------------------------------------------------------------------------------------
2021-12-31 09:34:38,419 epoch 18 - iter 180/1807 - loss 0.08343062 - samples/sec: 90.22 - lr: 0.100000
2021-12-31 09:34:54,655 epoch 18 - iter 360/1807 - loss 0.08575852 - samples/sec: 88.82 - lr: 0.100000
2021-12-31 09:35:10,385 epoch 18 - iter 540/1807 - loss 0.08392644 - samples/sec: 91.68 - lr: 0.100000
2021-12-31 09:35:26,310 epoch 18 - iter 720/1807 - loss 0.08351999 - samples/sec: 90.57 - lr: 0.100000
2021-12-31 09:35:41,876 epoch 18 - iter 900/1807 - loss 0.08509375 - samples/sec: 92.64 - lr: 0.100000
2021-12-31 09:35:57,882 epoch 18 - iter 1080/1807 - loss 0.08493115 - samples/sec: 90.10 - lr: 0.100000
2021-12-31 09:36:13,926 epoch 18 - iter 1260/1807 - loss 0.08609299 - samples/sec: 89.88 - lr: 0.100000
2021-12-31 09:36:30,070 epoch 18 - iter 1440/1807 - loss 0.08644835 - samples/sec: 89.34 - lr: 0.100000
2021-12-31 09:36:45,689 epoch 18 - iter 1620/1807 - loss 0.08698449 - samples/sec: 92.33 - lr: 0.100000
2021-12-31 09:37:01,595 epoch 18 - iter 1800/1807 - loss 0.08715385 - samples/sec: 90.66 - lr: 0.100000
2021-12-31 09:37:02,116 ----------------------------------------------------------------------------------------------------
2021-12-31 09:37:02,116 EPOCH 18 done: loss 0.0872 - lr 0.1000000
2021-12-31 09:37:38,287 DEV : loss 0.06376409530639648 - f1-score (micro avg)  0.982
2021-12-31 09:37:38,491 BAD EPOCHS (no improvement): 1
2021-12-31 09:37:38,492 ----------------------------------------------------------------------------------------------------
2021-12-31 09:37:54,464 epoch 19 - iter 180/1807 - loss 0.07802257 - samples/sec: 90.31 - lr: 0.100000
2021-12-31 09:38:10,256 epoch 19 - iter 360/1807 - loss 0.07892620 - samples/sec: 91.32 - lr: 0.100000
2021-12-31 09:38:26,632 epoch 19 - iter 540/1807 - loss 0.08133170 - samples/sec: 88.06 - lr: 0.100000
2021-12-31 09:38:42,673 epoch 19 - iter 720/1807 - loss 0.08367885 - samples/sec: 89.91 - lr: 0.100000
2021-12-31 09:38:58,503 epoch 19 - iter 900/1807 - loss 0.08447871 - samples/sec: 91.11 - lr: 0.100000
2021-12-31 09:39:14,461 epoch 19 - iter 1080/1807 - loss 0.08413767 - samples/sec: 90.37 - lr: 0.100000
2021-12-31 09:39:30,176 epoch 19 - iter 1260/1807 - loss 0.08455665 - samples/sec: 91.77 - lr: 0.100000
2021-12-31 09:39:46,325 epoch 19 - iter 1440/1807 - loss 0.08578599 - samples/sec: 89.30 - lr: 0.100000
2021-12-31 09:40:02,191 epoch 19 - iter 1620/1807 - loss 0.08628902 - samples/sec: 90.90 - lr: 0.100000
2021-12-31 09:40:18,069 epoch 19 - iter 1800/1807 - loss 0.08634962 - samples/sec: 90.82 - lr: 0.100000
2021-12-31 09:40:18,635 ----------------------------------------------------------------------------------------------------
2021-12-31 09:40:18,636 EPOCH 19 done: loss 0.0863 - lr 0.1000000
2021-12-31 09:40:54,638 DEV : loss 0.06360483914613724 - f1-score (micro avg)  0.9824
2021-12-31 09:40:54,809 BAD EPOCHS (no improvement): 0
2021-12-31 09:40:54,812 saving best model
2021-12-31 09:41:00,532 ----------------------------------------------------------------------------------------------------
2021-12-31 09:41:16,605 epoch 20 - iter 180/1807 - loss 0.08580796 - samples/sec: 89.75 - lr: 0.100000
2021-12-31 09:41:32,626 epoch 20 - iter 360/1807 - loss 0.08441046 - samples/sec: 90.02 - lr: 0.100000
2021-12-31 09:41:48,195 epoch 20 - iter 540/1807 - loss 0.08457436 - samples/sec: 92.63 - lr: 0.100000
2021-12-31 09:42:03,884 epoch 20 - iter 720/1807 - loss 0.08433505 - samples/sec: 91.92 - lr: 0.100000
2021-12-31 09:42:19,662 epoch 20 - iter 900/1807 - loss 0.08465375 - samples/sec: 91.40 - lr: 0.100000
2021-12-31 09:42:35,290 epoch 20 - iter 1080/1807 - loss 0.08384813 - samples/sec: 92.28 - lr: 0.100000
2021-12-31 09:42:50,667 epoch 20 - iter 1260/1807 - loss 0.08437448 - samples/sec: 93.79 - lr: 0.100000
2021-12-31 09:43:06,838 epoch 20 - iter 1440/1807 - loss 0.08483000 - samples/sec: 89.18 - lr: 0.100000
2021-12-31 09:43:23,128 epoch 20 - iter 1620/1807 - loss 0.08554680 - samples/sec: 88.52 - lr: 0.100000
2021-12-31 09:43:38,996 epoch 20 - iter 1800/1807 - loss 0.08579345 - samples/sec: 90.89 - lr: 0.100000
2021-12-31 09:43:39,520 ----------------------------------------------------------------------------------------------------
2021-12-31 09:43:39,520 EPOCH 20 done: loss 0.0858 - lr 0.1000000
2021-12-31 09:44:18,433 DEV : loss 0.06494450569152832 - f1-score (micro avg)  0.982
2021-12-31 09:44:18,588 BAD EPOCHS (no improvement): 1
2021-12-31 09:44:18,590 ----------------------------------------------------------------------------------------------------
2021-12-31 09:44:34,495 epoch 21 - iter 180/1807 - loss 0.08058450 - samples/sec: 90.65 - lr: 0.100000
2021-12-31 09:44:50,061 epoch 21 - iter 360/1807 - loss 0.08169987 - samples/sec: 92.62 - lr: 0.100000
2021-12-31 09:45:05,780 epoch 21 - iter 540/1807 - loss 0.08147401 - samples/sec: 91.76 - lr: 0.100000
2021-12-31 09:45:21,869 epoch 21 - iter 720/1807 - loss 0.08235327 - samples/sec: 89.64 - lr: 0.100000
2021-12-31 09:45:38,316 epoch 21 - iter 900/1807 - loss 0.08324710 - samples/sec: 87.67 - lr: 0.100000
2021-12-31 09:45:54,314 epoch 21 - iter 1080/1807 - loss 0.08294963 - samples/sec: 90.14 - lr: 0.100000
2021-12-31 09:46:10,369 epoch 21 - iter 1260/1807 - loss 0.08355307 - samples/sec: 89.83 - lr: 0.100000
2021-12-31 09:46:26,469 epoch 21 - iter 1440/1807 - loss 0.08343050 - samples/sec: 89.57 - lr: 0.100000
2021-12-31 09:46:42,401 epoch 21 - iter 1620/1807 - loss 0.08414815 - samples/sec: 90.52 - lr: 0.100000
2021-12-31 09:46:58,257 epoch 21 - iter 1800/1807 - loss 0.08376554 - samples/sec: 90.95 - lr: 0.100000
2021-12-31 09:46:58,880 ----------------------------------------------------------------------------------------------------
2021-12-31 09:46:58,880 EPOCH 21 done: loss 0.0839 - lr 0.1000000
2021-12-31 09:47:35,248 DEV : loss 0.06328344345092773 - f1-score (micro avg)  0.9827
2021-12-31 09:47:35,446 BAD EPOCHS (no improvement): 0
2021-12-31 09:47:35,448 saving best model
2021-12-31 09:47:41,248 ----------------------------------------------------------------------------------------------------
2021-12-31 09:47:57,255 epoch 22 - iter 180/1807 - loss 0.08050373 - samples/sec: 90.12 - lr: 0.100000
2021-12-31 09:48:13,186 epoch 22 - iter 360/1807 - loss 0.08239139 - samples/sec: 90.52 - lr: 0.100000
2021-12-31 09:48:29,067 epoch 22 - iter 540/1807 - loss 0.08228212 - samples/sec: 90.81 - lr: 0.100000
2021-12-31 09:48:45,039 epoch 22 - iter 720/1807 - loss 0.08279713 - samples/sec: 90.30 - lr: 0.100000
2021-12-31 09:49:00,510 epoch 22 - iter 900/1807 - loss 0.08334789 - samples/sec: 93.22 - lr: 0.100000
2021-12-31 09:49:16,362 epoch 22 - iter 1080/1807 - loss 0.08342389 - samples/sec: 90.97 - lr: 0.100000
2021-12-31 09:49:32,567 epoch 22 - iter 1260/1807 - loss 0.08349166 - samples/sec: 88.99 - lr: 0.100000
2021-12-31 09:49:48,320 epoch 22 - iter 1440/1807 - loss 0.08427908 - samples/sec: 91.55 - lr: 0.100000
2021-12-31 09:50:04,570 epoch 22 - iter 1620/1807 - loss 0.08465300 - samples/sec: 88.75 - lr: 0.100000
2021-12-31 09:50:20,943 epoch 22 - iter 1800/1807 - loss 0.08437528 - samples/sec: 88.07 - lr: 0.100000
2021-12-31 09:50:21,480 ----------------------------------------------------------------------------------------------------
2021-12-31 09:50:21,480 EPOCH 22 done: loss 0.0844 - lr 0.1000000
2021-12-31 09:50:58,771 DEV : loss 0.06346500664949417 - f1-score (micro avg)  0.9815
2021-12-31 09:50:58,967 BAD EPOCHS (no improvement): 1
2021-12-31 09:50:58,969 ----------------------------------------------------------------------------------------------------
2021-12-31 09:51:15,272 epoch 23 - iter 180/1807 - loss 0.07857499 - samples/sec: 88.47 - lr: 0.100000
2021-12-31 09:51:31,123 epoch 23 - iter 360/1807 - loss 0.07736816 - samples/sec: 91.00 - lr: 0.100000
2021-12-31 09:51:47,441 epoch 23 - iter 540/1807 - loss 0.07865886 - samples/sec: 88.38 - lr: 0.100000
2021-12-31 09:52:03,508 epoch 23 - iter 720/1807 - loss 0.08053686 - samples/sec: 89.75 - lr: 0.100000
2021-12-31 09:52:19,618 epoch 23 - iter 900/1807 - loss 0.08084826 - samples/sec: 89.52 - lr: 0.100000
2021-12-31 09:52:35,467 epoch 23 - iter 1080/1807 - loss 0.08116025 - samples/sec: 91.00 - lr: 0.100000
2021-12-31 09:52:51,307 epoch 23 - iter 1260/1807 - loss 0.08137722 - samples/sec: 91.04 - lr: 0.100000
2021-12-31 09:53:07,605 epoch 23 - iter 1440/1807 - loss 0.08168418 - samples/sec: 88.48 - lr: 0.100000
2021-12-31 09:53:23,242 epoch 23 - iter 1620/1807 - loss 0.08161521 - samples/sec: 92.22 - lr: 0.100000
2021-12-31 09:53:38,917 epoch 23 - iter 1800/1807 - loss 0.08147531 - samples/sec: 92.01 - lr: 0.100000
2021-12-31 09:53:39,396 ----------------------------------------------------------------------------------------------------
2021-12-31 09:53:39,396 EPOCH 23 done: loss 0.0814 - lr 0.1000000
2021-12-31 09:54:15,841 DEV : loss 0.06540019810199738 - f1-score (micro avg)  0.9821
2021-12-31 09:54:16,023 BAD EPOCHS (no improvement): 2
2021-12-31 09:54:16,025 ----------------------------------------------------------------------------------------------------
2021-12-31 09:54:32,334 epoch 24 - iter 180/1807 - loss 0.07795468 - samples/sec: 88.43 - lr: 0.100000
2021-12-31 09:54:48,084 epoch 24 - iter 360/1807 - loss 0.07908717 - samples/sec: 91.57 - lr: 0.100000
2021-12-31 09:55:04,326 epoch 24 - iter 540/1807 - loss 0.08004992 - samples/sec: 88.79 - lr: 0.100000
2021-12-31 09:55:20,651 epoch 24 - iter 720/1807 - loss 0.08100541 - samples/sec: 88.34 - lr: 0.100000
2021-12-31 09:55:36,785 epoch 24 - iter 900/1807 - loss 0.08142507 - samples/sec: 89.38 - lr: 0.100000
2021-12-31 09:55:52,742 epoch 24 - iter 1080/1807 - loss 0.08232817 - samples/sec: 90.38 - lr: 0.100000
2021-12-31 09:56:08,164 epoch 24 - iter 1260/1807 - loss 0.08188184 - samples/sec: 93.53 - lr: 0.100000
2021-12-31 09:56:24,063 epoch 24 - iter 1440/1807 - loss 0.08243719 - samples/sec: 90.71 - lr: 0.100000
2021-12-31 09:56:40,384 epoch 24 - iter 1620/1807 - loss 0.08222346 - samples/sec: 88.35 - lr: 0.100000
2021-12-31 09:56:56,011 epoch 24 - iter 1800/1807 - loss 0.08229498 - samples/sec: 92.29 - lr: 0.100000
2021-12-31 09:56:56,616 ----------------------------------------------------------------------------------------------------
2021-12-31 09:56:56,616 EPOCH 24 done: loss 0.0822 - lr 0.1000000
2021-12-31 09:57:35,721 DEV : loss 0.06453310698270798 - f1-score (micro avg)  0.9819
2021-12-31 09:57:35,917 BAD EPOCHS (no improvement): 3
2021-12-31 09:57:35,919 ----------------------------------------------------------------------------------------------------
2021-12-31 09:57:52,048 epoch 25 - iter 180/1807 - loss 0.07765362 - samples/sec: 89.42 - lr: 0.100000
2021-12-31 09:58:07,956 epoch 25 - iter 360/1807 - loss 0.07932940 - samples/sec: 90.65 - lr: 0.100000
2021-12-31 09:58:23,863 epoch 25 - iter 540/1807 - loss 0.08046614 - samples/sec: 90.65 - lr: 0.100000
2021-12-31 09:58:39,725 epoch 25 - iter 720/1807 - loss 0.07941669 - samples/sec: 90.92 - lr: 0.100000
2021-12-31 09:58:55,303 epoch 25 - iter 900/1807 - loss 0.08092722 - samples/sec: 92.57 - lr: 0.100000
2021-12-31 09:59:11,794 epoch 25 - iter 1080/1807 - loss 0.08150485 - samples/sec: 87.44 - lr: 0.100000
2021-12-31 09:59:27,795 epoch 25 - iter 1260/1807 - loss 0.08118184 - samples/sec: 90.13 - lr: 0.100000
2021-12-31 09:59:43,595 epoch 25 - iter 1440/1807 - loss 0.08068256 - samples/sec: 91.28 - lr: 0.100000
2021-12-31 09:59:59,146 epoch 25 - iter 1620/1807 - loss 0.08113371 - samples/sec: 92.74 - lr: 0.100000
2021-12-31 10:00:14,684 epoch 25 - iter 1800/1807 - loss 0.08112289 - samples/sec: 92.81 - lr: 0.100000
2021-12-31 10:00:15,230 ----------------------------------------------------------------------------------------------------
2021-12-31 10:00:15,230 EPOCH 25 done: loss 0.0812 - lr 0.1000000
2021-12-31 10:00:51,681 DEV : loss 0.06579063087701797 - f1-score (micro avg)  0.9817
2021-12-31 10:00:51,872 BAD EPOCHS (no improvement): 4
2021-12-31 10:00:51,874 ----------------------------------------------------------------------------------------------------
2021-12-31 10:01:08,252 epoch 26 - iter 180/1807 - loss 0.07473820 - samples/sec: 88.06 - lr: 0.050000
2021-12-31 10:01:24,095 epoch 26 - iter 360/1807 - loss 0.07741051 - samples/sec: 91.03 - lr: 0.050000
2021-12-31 10:01:40,042 epoch 26 - iter 540/1807 - loss 0.07612793 - samples/sec: 90.43 - lr: 0.050000
2021-12-31 10:01:55,977 epoch 26 - iter 720/1807 - loss 0.07597233 - samples/sec: 90.49 - lr: 0.050000
2021-12-31 10:02:12,264 epoch 26 - iter 900/1807 - loss 0.07560347 - samples/sec: 88.55 - lr: 0.050000
2021-12-31 10:02:28,030 epoch 26 - iter 1080/1807 - loss 0.07626889 - samples/sec: 91.47 - lr: 0.050000
2021-12-31 10:02:43,691 epoch 26 - iter 1260/1807 - loss 0.07613186 - samples/sec: 92.08 - lr: 0.050000
2021-12-31 10:02:59,223 epoch 26 - iter 1440/1807 - loss 0.07558384 - samples/sec: 92.85 - lr: 0.050000
2021-12-31 10:03:15,259 epoch 26 - iter 1620/1807 - loss 0.07503334 - samples/sec: 89.93 - lr: 0.050000
2021-12-31 10:03:31,614 epoch 26 - iter 1800/1807 - loss 0.07448614 - samples/sec: 88.18 - lr: 0.050000
2021-12-31 10:03:32,151 ----------------------------------------------------------------------------------------------------
2021-12-31 10:03:32,151 EPOCH 26 done: loss 0.0744 - lr 0.0500000
2021-12-31 10:04:08,767 DEV : loss 0.06646668165922165 - f1-score (micro avg)  0.9822
2021-12-31 10:04:08,949 BAD EPOCHS (no improvement): 1
2021-12-31 10:04:08,950 ----------------------------------------------------------------------------------------------------
2021-12-31 10:04:25,529 epoch 27 - iter 180/1807 - loss 0.06581114 - samples/sec: 86.99 - lr: 0.050000
2021-12-31 10:04:41,436 epoch 27 - iter 360/1807 - loss 0.06857834 - samples/sec: 90.66 - lr: 0.050000
2021-12-31 10:04:57,191 epoch 27 - iter 540/1807 - loss 0.07081005 - samples/sec: 91.54 - lr: 0.050000
2021-12-31 10:05:13,183 epoch 27 - iter 720/1807 - loss 0.07198836 - samples/sec: 90.18 - lr: 0.050000
2021-12-31 10:05:29,131 epoch 27 - iter 900/1807 - loss 0.07153264 - samples/sec: 90.42 - lr: 0.050000
2021-12-31 10:05:44,864 epoch 27 - iter 1080/1807 - loss 0.07164274 - samples/sec: 91.66 - lr: 0.050000
2021-12-31 10:06:00,643 epoch 27 - iter 1260/1807 - loss 0.07167991 - samples/sec: 91.40 - lr: 0.050000
2021-12-31 10:06:15,929 epoch 27 - iter 1440/1807 - loss 0.07130117 - samples/sec: 94.34 - lr: 0.050000
2021-12-31 10:06:32,208 epoch 27 - iter 1620/1807 - loss 0.07137995 - samples/sec: 88.59 - lr: 0.050000
2021-12-31 10:06:48,072 epoch 27 - iter 1800/1807 - loss 0.07123898 - samples/sec: 90.90 - lr: 0.050000
2021-12-31 10:06:48,616 ----------------------------------------------------------------------------------------------------
2021-12-31 10:06:48,616 EPOCH 27 done: loss 0.0712 - lr 0.0500000
2021-12-31 10:07:27,769 DEV : loss 0.06514652073383331 - f1-score (micro avg)  0.9823
2021-12-31 10:07:27,967 BAD EPOCHS (no improvement): 2
2021-12-31 10:07:27,968 ----------------------------------------------------------------------------------------------------
2021-12-31 10:07:43,921 epoch 28 - iter 180/1807 - loss 0.06865415 - samples/sec: 90.41 - lr: 0.050000
2021-12-31 10:08:00,073 epoch 28 - iter 360/1807 - loss 0.06855531 - samples/sec: 89.28 - lr: 0.050000
2021-12-31 10:08:16,259 epoch 28 - iter 540/1807 - loss 0.06891820 - samples/sec: 89.09 - lr: 0.050000
2021-12-31 10:08:31,981 epoch 28 - iter 720/1807 - loss 0.06951336 - samples/sec: 91.73 - lr: 0.050000
2021-12-31 10:08:47,429 epoch 28 - iter 900/1807 - loss 0.07014278 - samples/sec: 93.35 - lr: 0.050000
2021-12-31 10:09:03,024 epoch 28 - iter 1080/1807 - loss 0.07071541 - samples/sec: 92.47 - lr: 0.050000
2021-12-31 10:09:18,974 epoch 28 - iter 1260/1807 - loss 0.07012373 - samples/sec: 90.41 - lr: 0.050000
2021-12-31 10:09:34,620 epoch 28 - iter 1440/1807 - loss 0.07028479 - samples/sec: 92.17 - lr: 0.050000
2021-12-31 10:09:50,427 epoch 28 - iter 1620/1807 - loss 0.07017402 - samples/sec: 91.23 - lr: 0.050000
2021-12-31 10:10:05,997 epoch 28 - iter 1800/1807 - loss 0.07002142 - samples/sec: 92.62 - lr: 0.050000
2021-12-31 10:10:06,547 ----------------------------------------------------------------------------------------------------
2021-12-31 10:10:06,548 EPOCH 28 done: loss 0.0701 - lr 0.0500000
2021-12-31 10:10:43,342 DEV : loss 0.06285692006349564 - f1-score (micro avg)  0.9828
2021-12-31 10:10:43,549 BAD EPOCHS (no improvement): 0
2021-12-31 10:10:43,550 saving best model
2021-12-31 10:10:49,346 ----------------------------------------------------------------------------------------------------
2021-12-31 10:11:05,893 epoch 29 - iter 180/1807 - loss 0.06749112 - samples/sec: 87.17 - lr: 0.050000
2021-12-31 10:11:21,660 epoch 29 - iter 360/1807 - loss 0.06704871 - samples/sec: 91.46 - lr: 0.050000
2021-12-31 10:11:37,404 epoch 29 - iter 540/1807 - loss 0.06846136 - samples/sec: 91.60 - lr: 0.050000
2021-12-31 10:11:53,397 epoch 29 - iter 720/1807 - loss 0.06901632 - samples/sec: 90.17 - lr: 0.050000
2021-12-31 10:12:09,257 epoch 29 - iter 900/1807 - loss 0.06809349 - samples/sec: 90.93 - lr: 0.050000
2021-12-31 10:12:24,599 epoch 29 - iter 1080/1807 - loss 0.06824897 - samples/sec: 94.00 - lr: 0.050000
2021-12-31 10:12:40,447 epoch 29 - iter 1260/1807 - loss 0.06782382 - samples/sec: 91.00 - lr: 0.050000
2021-12-31 10:12:56,595 epoch 29 - iter 1440/1807 - loss 0.06808796 - samples/sec: 89.30 - lr: 0.050000
2021-12-31 10:13:12,755 epoch 29 - iter 1620/1807 - loss 0.06798634 - samples/sec: 89.24 - lr: 0.050000
2021-12-31 10:13:28,701 epoch 29 - iter 1800/1807 - loss 0.06777472 - samples/sec: 90.44 - lr: 0.050000
2021-12-31 10:13:29,227 ----------------------------------------------------------------------------------------------------
2021-12-31 10:13:29,228 EPOCH 29 done: loss 0.0678 - lr 0.0500000
2021-12-31 10:14:05,041 DEV : loss 0.06288447976112366 - f1-score (micro avg)  0.9831
2021-12-31 10:14:05,221 BAD EPOCHS (no improvement): 0
2021-12-31 10:14:05,222 saving best model
2021-12-31 10:14:10,675 ----------------------------------------------------------------------------------------------------
2021-12-31 10:14:26,845 epoch 30 - iter 180/1807 - loss 0.06615046 - samples/sec: 89.20 - lr: 0.050000
2021-12-31 10:14:42,781 epoch 30 - iter 360/1807 - loss 0.06701908 - samples/sec: 90.50 - lr: 0.050000
2021-12-31 10:14:58,746 epoch 30 - iter 540/1807 - loss 0.06748578 - samples/sec: 90.33 - lr: 0.050000
2021-12-31 10:15:14,479 epoch 30 - iter 720/1807 - loss 0.06796474 - samples/sec: 91.66 - lr: 0.050000
2021-12-31 10:15:30,280 epoch 30 - iter 900/1807 - loss 0.06739311 - samples/sec: 91.26 - lr: 0.050000
2021-12-31 10:15:45,933 epoch 30 - iter 1080/1807 - loss 0.06699810 - samples/sec: 92.13 - lr: 0.050000
2021-12-31 10:16:01,690 epoch 30 - iter 1260/1807 - loss 0.06745951 - samples/sec: 91.53 - lr: 0.050000
2021-12-31 10:16:17,453 epoch 30 - iter 1440/1807 - loss 0.06704309 - samples/sec: 91.49 - lr: 0.050000
2021-12-31 10:16:33,233 epoch 30 - iter 1620/1807 - loss 0.06649743 - samples/sec: 91.38 - lr: 0.050000
2021-12-31 10:16:49,143 epoch 30 - iter 1800/1807 - loss 0.06655280 - samples/sec: 90.65 - lr: 0.050000
2021-12-31 10:16:49,685 ----------------------------------------------------------------------------------------------------
2021-12-31 10:16:49,685 EPOCH 30 done: loss 0.0666 - lr 0.0500000
2021-12-31 10:17:28,240 DEV : loss 0.06311798095703125 - f1-score (micro avg)  0.9824
2021-12-31 10:17:28,433 BAD EPOCHS (no improvement): 1
2021-12-31 10:17:28,434 ----------------------------------------------------------------------------------------------------
2021-12-31 10:17:44,966 epoch 31 - iter 180/1807 - loss 0.06627745 - samples/sec: 87.24 - lr: 0.050000
2021-12-31 10:18:00,662 epoch 31 - iter 360/1807 - loss 0.06286711 - samples/sec: 91.88 - lr: 0.050000
2021-12-31 10:18:16,307 epoch 31 - iter 540/1807 - loss 0.06454841 - samples/sec: 92.17 - lr: 0.050000
2021-12-31 10:18:32,243 epoch 31 - iter 720/1807 - loss 0.06465161 - samples/sec: 90.50 - lr: 0.050000
2021-12-31 10:18:47,799 epoch 31 - iter 900/1807 - loss 0.06488043 - samples/sec: 92.70 - lr: 0.050000
2021-12-31 10:19:03,602 epoch 31 - iter 1080/1807 - loss 0.06501278 - samples/sec: 91.26 - lr: 0.050000
2021-12-31 10:19:19,610 epoch 31 - iter 1260/1807 - loss 0.06524649 - samples/sec: 90.08 - lr: 0.050000
2021-12-31 10:19:35,038 epoch 31 - iter 1440/1807 - loss 0.06554492 - samples/sec: 93.48 - lr: 0.050000
2021-12-31 10:19:51,164 epoch 31 - iter 1620/1807 - loss 0.06599922 - samples/sec: 89.43 - lr: 0.050000
2021-12-31 10:20:07,078 epoch 31 - iter 1800/1807 - loss 0.06644678 - samples/sec: 90.61 - lr: 0.050000
2021-12-31 10:20:07,640 ----------------------------------------------------------------------------------------------------
2021-12-31 10:20:07,640 EPOCH 31 done: loss 0.0666 - lr 0.0500000
2021-12-31 10:20:43,927 DEV : loss 0.06285466253757477 - f1-score (micro avg)  0.9829
2021-12-31 10:20:44,123 BAD EPOCHS (no improvement): 2
2021-12-31 10:20:44,125 ----------------------------------------------------------------------------------------------------
2021-12-31 10:21:00,298 epoch 32 - iter 180/1807 - loss 0.06077116 - samples/sec: 89.18 - lr: 0.050000
2021-12-31 10:21:16,393 epoch 32 - iter 360/1807 - loss 0.06270324 - samples/sec: 89.60 - lr: 0.050000
2021-12-31 10:21:32,158 epoch 32 - iter 540/1807 - loss 0.06340224 - samples/sec: 91.47 - lr: 0.050000
2021-12-31 10:21:48,183 epoch 32 - iter 720/1807 - loss 0.06267842 - samples/sec: 89.99 - lr: 0.050000
2021-12-31 10:22:03,949 epoch 32 - iter 900/1807 - loss 0.06345792 - samples/sec: 91.50 - lr: 0.050000
2021-12-31 10:22:19,674 epoch 32 - iter 1080/1807 - loss 0.06439376 - samples/sec: 91.71 - lr: 0.050000
2021-12-31 10:22:35,414 epoch 32 - iter 1260/1807 - loss 0.06437464 - samples/sec: 91.63 - lr: 0.050000
2021-12-31 10:22:51,702 epoch 32 - iter 1440/1807 - loss 0.06435182 - samples/sec: 88.53 - lr: 0.050000
2021-12-31 10:23:07,918 epoch 32 - iter 1620/1807 - loss 0.06467809 - samples/sec: 88.93 - lr: 0.050000
2021-12-31 10:23:23,880 epoch 32 - iter 1800/1807 - loss 0.06484923 - samples/sec: 90.35 - lr: 0.050000
2021-12-31 10:23:24,513 ----------------------------------------------------------------------------------------------------
2021-12-31 10:23:24,513 EPOCH 32 done: loss 0.0648 - lr 0.0500000
2021-12-31 10:24:00,678 DEV : loss 0.062373436987400055 - f1-score (micro avg)  0.9827
2021-12-31 10:24:00,863 BAD EPOCHS (no improvement): 3
2021-12-31 10:24:00,865 ----------------------------------------------------------------------------------------------------
2021-12-31 10:24:17,368 epoch 33 - iter 180/1807 - loss 0.06511517 - samples/sec: 87.39 - lr: 0.050000
2021-12-31 10:24:33,869 epoch 33 - iter 360/1807 - loss 0.06359714 - samples/sec: 87.39 - lr: 0.050000
2021-12-31 10:24:49,974 epoch 33 - iter 540/1807 - loss 0.06324776 - samples/sec: 89.54 - lr: 0.050000
2021-12-31 10:25:05,411 epoch 33 - iter 720/1807 - loss 0.06296883 - samples/sec: 93.42 - lr: 0.050000
2021-12-31 10:25:21,477 epoch 33 - iter 900/1807 - loss 0.06304943 - samples/sec: 89.76 - lr: 0.050000
2021-12-31 10:25:37,062 epoch 33 - iter 1080/1807 - loss 0.06266940 - samples/sec: 92.52 - lr: 0.050000
2021-12-31 10:25:52,743 epoch 33 - iter 1260/1807 - loss 0.06359599 - samples/sec: 91.97 - lr: 0.050000
2021-12-31 10:26:08,521 epoch 33 - iter 1440/1807 - loss 0.06353058 - samples/sec: 91.40 - lr: 0.050000
2021-12-31 10:26:24,080 epoch 33 - iter 1620/1807 - loss 0.06366170 - samples/sec: 92.69 - lr: 0.050000
2021-12-31 10:26:39,568 epoch 33 - iter 1800/1807 - loss 0.06405823 - samples/sec: 93.11 - lr: 0.050000
2021-12-31 10:26:40,121 ----------------------------------------------------------------------------------------------------
2021-12-31 10:26:40,121 EPOCH 33 done: loss 0.0640 - lr 0.0500000
2021-12-31 10:27:18,678 DEV : loss 0.06352584064006805 - f1-score (micro avg)  0.983
2021-12-31 10:27:18,875 BAD EPOCHS (no improvement): 4
2021-12-31 10:27:18,877 ----------------------------------------------------------------------------------------------------
2021-12-31 10:27:34,632 epoch 34 - iter 180/1807 - loss 0.05738992 - samples/sec: 91.55 - lr: 0.025000
2021-12-31 10:27:50,783 epoch 34 - iter 360/1807 - loss 0.05964139 - samples/sec: 89.29 - lr: 0.025000
2021-12-31 10:28:06,956 epoch 34 - iter 540/1807 - loss 0.05950577 - samples/sec: 89.16 - lr: 0.025000
2021-12-31 10:28:23,264 epoch 34 - iter 720/1807 - loss 0.06033373 - samples/sec: 88.43 - lr: 0.025000
2021-12-31 10:28:38,762 epoch 34 - iter 900/1807 - loss 0.06053852 - samples/sec: 93.06 - lr: 0.025000
2021-12-31 10:28:54,790 epoch 34 - iter 1080/1807 - loss 0.06008683 - samples/sec: 89.97 - lr: 0.025000
2021-12-31 10:29:10,752 epoch 34 - iter 1260/1807 - loss 0.06017032 - samples/sec: 90.34 - lr: 0.025000
2021-12-31 10:29:26,533 epoch 34 - iter 1440/1807 - loss 0.06026720 - samples/sec: 91.39 - lr: 0.025000
2021-12-31 10:29:41,962 epoch 34 - iter 1620/1807 - loss 0.06023939 - samples/sec: 93.47 - lr: 0.025000
2021-12-31 10:29:57,974 epoch 34 - iter 1800/1807 - loss 0.06024915 - samples/sec: 90.06 - lr: 0.025000
2021-12-31 10:29:58,641 ----------------------------------------------------------------------------------------------------
2021-12-31 10:29:58,642 EPOCH 34 done: loss 0.0602 - lr 0.0250000
2021-12-31 10:30:34,901 DEV : loss 0.06348917633295059 - f1-score (micro avg)  0.9835
2021-12-31 10:30:35,087 BAD EPOCHS (no improvement): 0
2021-12-31 10:30:35,089 saving best model
2021-12-31 10:30:40,883 ----------------------------------------------------------------------------------------------------
2021-12-31 10:30:57,202 epoch 35 - iter 180/1807 - loss 0.05878333 - samples/sec: 88.38 - lr: 0.025000
2021-12-31 10:31:12,996 epoch 35 - iter 360/1807 - loss 0.05795906 - samples/sec: 91.32 - lr: 0.025000
2021-12-31 10:31:29,079 epoch 35 - iter 540/1807 - loss 0.05935994 - samples/sec: 89.67 - lr: 0.025000
2021-12-31 10:31:45,084 epoch 35 - iter 720/1807 - loss 0.05982168 - samples/sec: 90.10 - lr: 0.025000
2021-12-31 10:32:00,692 epoch 35 - iter 900/1807 - loss 0.05928538 - samples/sec: 92.39 - lr: 0.025000
2021-12-31 10:32:16,615 epoch 35 - iter 1080/1807 - loss 0.05961166 - samples/sec: 90.58 - lr: 0.025000
2021-12-31 10:32:32,475 epoch 35 - iter 1260/1807 - loss 0.06019352 - samples/sec: 90.93 - lr: 0.025000
2021-12-31 10:32:48,494 epoch 35 - iter 1440/1807 - loss 0.06020781 - samples/sec: 90.02 - lr: 0.025000
2021-12-31 10:33:04,244 epoch 35 - iter 1620/1807 - loss 0.05999299 - samples/sec: 91.57 - lr: 0.025000
2021-12-31 10:33:20,684 epoch 35 - iter 1800/1807 - loss 0.05998842 - samples/sec: 87.72 - lr: 0.025000
2021-12-31 10:33:21,238 ----------------------------------------------------------------------------------------------------
2021-12-31 10:33:21,238 EPOCH 35 done: loss 0.0600 - lr 0.0250000
2021-12-31 10:33:57,434 DEV : loss 0.06338120251893997 - f1-score (micro avg)  0.9829
2021-12-31 10:33:57,624 BAD EPOCHS (no improvement): 1
2021-12-31 10:33:57,626 ----------------------------------------------------------------------------------------------------
2021-12-31 10:34:13,768 epoch 36 - iter 180/1807 - loss 0.06028850 - samples/sec: 89.35 - lr: 0.025000
2021-12-31 10:34:29,556 epoch 36 - iter 360/1807 - loss 0.05827195 - samples/sec: 91.34 - lr: 0.025000
2021-12-31 10:34:46,060 epoch 36 - iter 540/1807 - loss 0.05947832 - samples/sec: 87.38 - lr: 0.025000
2021-12-31 10:35:02,018 epoch 36 - iter 720/1807 - loss 0.05898679 - samples/sec: 90.38 - lr: 0.025000
2021-12-31 10:35:18,203 epoch 36 - iter 900/1807 - loss 0.05910041 - samples/sec: 89.10 - lr: 0.025000
2021-12-31 10:35:34,254 epoch 36 - iter 1080/1807 - loss 0.05973540 - samples/sec: 89.84 - lr: 0.025000
2021-12-31 10:35:50,256 epoch 36 - iter 1260/1807 - loss 0.05924335 - samples/sec: 90.13 - lr: 0.025000
2021-12-31 10:36:06,236 epoch 36 - iter 1440/1807 - loss 0.05881263 - samples/sec: 90.25 - lr: 0.025000
2021-12-31 10:36:22,117 epoch 36 - iter 1620/1807 - loss 0.05885928 - samples/sec: 90.80 - lr: 0.025000
2021-12-31 10:36:38,208 epoch 36 - iter 1800/1807 - loss 0.05867245 - samples/sec: 89.62 - lr: 0.025000
2021-12-31 10:36:38,763 ----------------------------------------------------------------------------------------------------
2021-12-31 10:36:38,763 EPOCH 36 done: loss 0.0587 - lr 0.0250000
2021-12-31 10:37:17,552 DEV : loss 0.06424003839492798 - f1-score (micro avg)  0.9835
2021-12-31 10:37:17,751 BAD EPOCHS (no improvement): 2
2021-12-31 10:37:17,752 ----------------------------------------------------------------------------------------------------
2021-12-31 10:37:33,804 epoch 37 - iter 180/1807 - loss 0.05692650 - samples/sec: 89.85 - lr: 0.025000
2021-12-31 10:37:50,368 epoch 37 - iter 360/1807 - loss 0.05616469 - samples/sec: 87.06 - lr: 0.025000
2021-12-31 10:38:06,389 epoch 37 - iter 540/1807 - loss 0.05662717 - samples/sec: 90.01 - lr: 0.025000
2021-12-31 10:38:22,399 epoch 37 - iter 720/1807 - loss 0.05716632 - samples/sec: 90.08 - lr: 0.025000
2021-12-31 10:38:37,783 epoch 37 - iter 900/1807 - loss 0.05713545 - samples/sec: 93.74 - lr: 0.025000
2021-12-31 10:38:53,871 epoch 37 - iter 1080/1807 - loss 0.05764661 - samples/sec: 89.64 - lr: 0.025000
2021-12-31 10:39:10,031 epoch 37 - iter 1260/1807 - loss 0.05713711 - samples/sec: 89.23 - lr: 0.025000
2021-12-31 10:39:25,737 epoch 37 - iter 1440/1807 - loss 0.05769197 - samples/sec: 91.83 - lr: 0.025000
2021-12-31 10:39:41,486 epoch 37 - iter 1620/1807 - loss 0.05788084 - samples/sec: 91.57 - lr: 0.025000
2021-12-31 10:39:57,218 epoch 37 - iter 1800/1807 - loss 0.05864320 - samples/sec: 91.67 - lr: 0.025000
2021-12-31 10:39:57,747 ----------------------------------------------------------------------------------------------------
2021-12-31 10:39:57,748 EPOCH 37 done: loss 0.0586 - lr 0.0250000
2021-12-31 10:40:34,869 DEV : loss 0.06326954811811447 - f1-score (micro avg)  0.9831
2021-12-31 10:40:35,052 BAD EPOCHS (no improvement): 3
2021-12-31 10:40:35,054 ----------------------------------------------------------------------------------------------------
2021-12-31 10:40:51,312 epoch 38 - iter 180/1807 - loss 0.05496563 - samples/sec: 88.71 - lr: 0.025000
2021-12-31 10:41:07,088 epoch 38 - iter 360/1807 - loss 0.05435886 - samples/sec: 91.42 - lr: 0.025000
2021-12-31 10:41:22,841 epoch 38 - iter 540/1807 - loss 0.05464384 - samples/sec: 91.55 - lr: 0.025000
2021-12-31 10:41:38,398 epoch 38 - iter 720/1807 - loss 0.05548335 - samples/sec: 92.69 - lr: 0.025000
2021-12-31 10:41:54,754 epoch 38 - iter 900/1807 - loss 0.05628518 - samples/sec: 88.18 - lr: 0.025000
2021-12-31 10:42:10,229 epoch 38 - iter 1080/1807 - loss 0.05604961 - samples/sec: 93.19 - lr: 0.025000
2021-12-31 10:42:26,417 epoch 38 - iter 1260/1807 - loss 0.05594531 - samples/sec: 89.09 - lr: 0.025000
2021-12-31 10:42:42,839 epoch 38 - iter 1440/1807 - loss 0.05651329 - samples/sec: 87.81 - lr: 0.025000
2021-12-31 10:42:58,889 epoch 38 - iter 1620/1807 - loss 0.05695998 - samples/sec: 89.85 - lr: 0.025000
2021-12-31 10:43:15,043 epoch 38 - iter 1800/1807 - loss 0.05706783 - samples/sec: 89.27 - lr: 0.025000
2021-12-31 10:43:15,590 ----------------------------------------------------------------------------------------------------
2021-12-31 10:43:15,590 EPOCH 38 done: loss 0.0570 - lr 0.0250000
2021-12-31 10:43:52,423 DEV : loss 0.06343492120504379 - f1-score (micro avg)  0.9831
2021-12-31 10:43:52,610 BAD EPOCHS (no improvement): 4
2021-12-31 10:43:52,612 ----------------------------------------------------------------------------------------------------
2021-12-31 10:44:08,739 epoch 39 - iter 180/1807 - loss 0.05834451 - samples/sec: 89.43 - lr: 0.012500
2021-12-31 10:44:24,462 epoch 39 - iter 360/1807 - loss 0.05496382 - samples/sec: 91.72 - lr: 0.012500
2021-12-31 10:44:40,570 epoch 39 - iter 540/1807 - loss 0.05537094 - samples/sec: 89.53 - lr: 0.012500
2021-12-31 10:44:56,434 epoch 39 - iter 720/1807 - loss 0.05546561 - samples/sec: 90.90 - lr: 0.012500
2021-12-31 10:45:12,338 epoch 39 - iter 900/1807 - loss 0.05527723 - samples/sec: 90.67 - lr: 0.012500
2021-12-31 10:45:27,903 epoch 39 - iter 1080/1807 - loss 0.05518412 - samples/sec: 92.65 - lr: 0.012500
2021-12-31 10:45:43,777 epoch 39 - iter 1260/1807 - loss 0.05540916 - samples/sec: 90.86 - lr: 0.012500
2021-12-31 10:45:59,259 epoch 39 - iter 1440/1807 - loss 0.05568263 - samples/sec: 93.15 - lr: 0.012500
2021-12-31 10:46:15,024 epoch 39 - iter 1620/1807 - loss 0.05532678 - samples/sec: 91.47 - lr: 0.012500
2021-12-31 10:46:30,975 epoch 39 - iter 1800/1807 - loss 0.05524694 - samples/sec: 90.40 - lr: 0.012500
2021-12-31 10:46:31,584 ----------------------------------------------------------------------------------------------------
2021-12-31 10:46:31,585 EPOCH 39 done: loss 0.0552 - lr 0.0125000
2021-12-31 10:47:10,908 DEV : loss 0.06419230252504349 - f1-score (micro avg)  0.9829
2021-12-31 10:47:11,105 BAD EPOCHS (no improvement): 1
2021-12-31 10:47:11,106 ----------------------------------------------------------------------------------------------------
2021-12-31 10:47:26,949 epoch 40 - iter 180/1807 - loss 0.05824543 - samples/sec: 91.06 - lr: 0.012500
2021-12-31 10:47:42,913 epoch 40 - iter 360/1807 - loss 0.05527233 - samples/sec: 90.33 - lr: 0.012500
2021-12-31 10:47:59,224 epoch 40 - iter 540/1807 - loss 0.05570769 - samples/sec: 88.41 - lr: 0.012500
2021-12-31 10:48:14,703 epoch 40 - iter 720/1807 - loss 0.05485811 - samples/sec: 93.17 - lr: 0.012500
2021-12-31 10:48:30,458 epoch 40 - iter 900/1807 - loss 0.05502772 - samples/sec: 91.54 - lr: 0.012500
2021-12-31 10:48:46,369 epoch 40 - iter 1080/1807 - loss 0.05487373 - samples/sec: 90.63 - lr: 0.012500
2021-12-31 10:49:01,734 epoch 40 - iter 1260/1807 - loss 0.05438047 - samples/sec: 93.85 - lr: 0.012500
2021-12-31 10:49:17,649 epoch 40 - iter 1440/1807 - loss 0.05459548 - samples/sec: 90.61 - lr: 0.012500
2021-12-31 10:49:33,390 epoch 40 - iter 1620/1807 - loss 0.05450567 - samples/sec: 91.62 - lr: 0.012500
2021-12-31 10:49:49,353 epoch 40 - iter 1800/1807 - loss 0.05462945 - samples/sec: 90.34 - lr: 0.012500
2021-12-31 10:49:49,959 ----------------------------------------------------------------------------------------------------
2021-12-31 10:49:49,959 EPOCH 40 done: loss 0.0546 - lr 0.0125000
2021-12-31 10:50:26,216 DEV : loss 0.06343018263578415 - f1-score (micro avg)  0.9829
2021-12-31 10:50:26,401 BAD EPOCHS (no improvement): 2
2021-12-31 10:50:26,402 ----------------------------------------------------------------------------------------------------
2021-12-31 10:50:42,801 epoch 41 - iter 180/1807 - loss 0.04923909 - samples/sec: 87.95 - lr: 0.012500
2021-12-31 10:50:58,898 epoch 41 - iter 360/1807 - loss 0.05125288 - samples/sec: 89.59 - lr: 0.012500
2021-12-31 10:51:14,501 epoch 41 - iter 540/1807 - loss 0.05242298 - samples/sec: 92.43 - lr: 0.012500
2021-12-31 10:51:30,244 epoch 41 - iter 720/1807 - loss 0.05272643 - samples/sec: 91.60 - lr: 0.012500
2021-12-31 10:51:46,266 epoch 41 - iter 900/1807 - loss 0.05277145 - samples/sec: 90.01 - lr: 0.012500
2021-12-31 10:52:02,535 epoch 41 - iter 1080/1807 - loss 0.05329680 - samples/sec: 88.64 - lr: 0.012500
2021-12-31 10:52:18,362 epoch 41 - iter 1260/1807 - loss 0.05349535 - samples/sec: 91.12 - lr: 0.012500
2021-12-31 10:52:34,324 epoch 41 - iter 1440/1807 - loss 0.05371268 - samples/sec: 90.35 - lr: 0.012500
2021-12-31 10:52:50,154 epoch 41 - iter 1620/1807 - loss 0.05362217 - samples/sec: 91.09 - lr: 0.012500
2021-12-31 10:53:06,114 epoch 41 - iter 1800/1807 - loss 0.05361560 - samples/sec: 90.36 - lr: 0.012500
2021-12-31 10:53:06,648 ----------------------------------------------------------------------------------------------------
2021-12-31 10:53:06,649 EPOCH 41 done: loss 0.0537 - lr 0.0125000
2021-12-31 10:53:42,920 DEV : loss 0.06420625746250153 - f1-score (micro avg)  0.9831
2021-12-31 10:53:43,107 BAD EPOCHS (no improvement): 3
2021-12-31 10:53:43,108 ----------------------------------------------------------------------------------------------------
2021-12-31 10:53:59,320 epoch 42 - iter 180/1807 - loss 0.04886676 - samples/sec: 88.96 - lr: 0.012500
2021-12-31 10:54:15,301 epoch 42 - iter 360/1807 - loss 0.05210812 - samples/sec: 90.24 - lr: 0.012500
2021-12-31 10:54:31,014 epoch 42 - iter 540/1807 - loss 0.05220145 - samples/sec: 91.78 - lr: 0.012500
2021-12-31 10:54:46,930 epoch 42 - iter 720/1807 - loss 0.05239133 - samples/sec: 90.61 - lr: 0.012500
2021-12-31 10:55:02,977 epoch 42 - iter 900/1807 - loss 0.05260141 - samples/sec: 89.87 - lr: 0.012500
2021-12-31 10:55:19,228 epoch 42 - iter 1080/1807 - loss 0.05260187 - samples/sec: 88.74 - lr: 0.012500
2021-12-31 10:55:35,215 epoch 42 - iter 1260/1807 - loss 0.05242910 - samples/sec: 90.21 - lr: 0.012500
2021-12-31 10:55:51,163 epoch 42 - iter 1440/1807 - loss 0.05265492 - samples/sec: 90.43 - lr: 0.012500
2021-12-31 10:56:07,328 epoch 42 - iter 1620/1807 - loss 0.05317972 - samples/sec: 89.21 - lr: 0.012500
2021-12-31 10:56:23,405 epoch 42 - iter 1800/1807 - loss 0.05319734 - samples/sec: 89.70 - lr: 0.012500
2021-12-31 10:56:23,951 ----------------------------------------------------------------------------------------------------
2021-12-31 10:56:23,951 EPOCH 42 done: loss 0.0532 - lr 0.0125000
2021-12-31 10:57:03,168 DEV : loss 0.06362675130367279 - f1-score (micro avg)  0.9831
2021-12-31 10:57:03,368 BAD EPOCHS (no improvement): 4
2021-12-31 10:57:03,370 ----------------------------------------------------------------------------------------------------
2021-12-31 10:57:19,009 epoch 43 - iter 180/1807 - loss 0.05496817 - samples/sec: 92.23 - lr: 0.006250
2021-12-31 10:57:34,952 epoch 43 - iter 360/1807 - loss 0.05262157 - samples/sec: 90.45 - lr: 0.006250
2021-12-31 10:57:51,104 epoch 43 - iter 540/1807 - loss 0.05252708 - samples/sec: 89.28 - lr: 0.006250
2021-12-31 10:58:06,630 epoch 43 - iter 720/1807 - loss 0.05258453 - samples/sec: 92.89 - lr: 0.006250
2021-12-31 10:58:22,297 epoch 43 - iter 900/1807 - loss 0.05170441 - samples/sec: 92.05 - lr: 0.006250
2021-12-31 10:58:38,636 epoch 43 - iter 1080/1807 - loss 0.05199907 - samples/sec: 88.26 - lr: 0.006250
2021-12-31 10:58:54,582 epoch 43 - iter 1260/1807 - loss 0.05289598 - samples/sec: 90.42 - lr: 0.006250
2021-12-31 10:59:10,756 epoch 43 - iter 1440/1807 - loss 0.05239565 - samples/sec: 89.17 - lr: 0.006250
2021-12-31 10:59:26,756 epoch 43 - iter 1620/1807 - loss 0.05245197 - samples/sec: 90.14 - lr: 0.006250
2021-12-31 10:59:43,140 epoch 43 - iter 1800/1807 - loss 0.05236153 - samples/sec: 88.01 - lr: 0.006250
2021-12-31 10:59:43,734 ----------------------------------------------------------------------------------------------------
2021-12-31 10:59:43,734 EPOCH 43 done: loss 0.0523 - lr 0.0062500
2021-12-31 11:00:19,875 DEV : loss 0.06449297815561295 - f1-score (micro avg)  0.983
2021-12-31 11:00:20,058 BAD EPOCHS (no improvement): 1
2021-12-31 11:00:20,060 ----------------------------------------------------------------------------------------------------
2021-12-31 11:00:36,054 epoch 44 - iter 180/1807 - loss 0.05668095 - samples/sec: 90.17 - lr: 0.006250
2021-12-31 11:00:51,879 epoch 44 - iter 360/1807 - loss 0.05376107 - samples/sec: 91.13 - lr: 0.006250
2021-12-31 11:01:07,774 epoch 44 - iter 540/1807 - loss 0.05410164 - samples/sec: 90.73 - lr: 0.006250
2021-12-31 11:01:23,539 epoch 44 - iter 720/1807 - loss 0.05349578 - samples/sec: 91.47 - lr: 0.006250
2021-12-31 11:01:39,511 epoch 44 - iter 900/1807 - loss 0.05316904 - samples/sec: 90.29 - lr: 0.006250
2021-12-31 11:01:55,495 epoch 44 - iter 1080/1807 - loss 0.05360298 - samples/sec: 90.23 - lr: 0.006250
2021-12-31 11:02:11,974 epoch 44 - iter 1260/1807 - loss 0.05360002 - samples/sec: 87.52 - lr: 0.006250
2021-12-31 11:02:27,697 epoch 44 - iter 1440/1807 - loss 0.05333331 - samples/sec: 91.72 - lr: 0.006250
2021-12-31 11:02:43,120 epoch 44 - iter 1620/1807 - loss 0.05286587 - samples/sec: 93.50 - lr: 0.006250
2021-12-31 11:02:58,798 epoch 44 - iter 1800/1807 - loss 0.05270956 - samples/sec: 91.99 - lr: 0.006250
2021-12-31 11:02:59,351 ----------------------------------------------------------------------------------------------------
2021-12-31 11:02:59,352 EPOCH 44 done: loss 0.0527 - lr 0.0062500
2021-12-31 11:03:35,832 DEV : loss 0.06455685943365097 - f1-score (micro avg)  0.9831
2021-12-31 11:03:36,019 BAD EPOCHS (no improvement): 2
2021-12-31 11:03:36,021 ----------------------------------------------------------------------------------------------------
2021-12-31 11:03:52,202 epoch 45 - iter 180/1807 - loss 0.05063292 - samples/sec: 89.13 - lr: 0.006250
2021-12-31 11:04:08,225 epoch 45 - iter 360/1807 - loss 0.05171673 - samples/sec: 90.00 - lr: 0.006250
2021-12-31 11:04:24,263 epoch 45 - iter 540/1807 - loss 0.05167432 - samples/sec: 89.93 - lr: 0.006250
2021-12-31 11:04:40,362 epoch 45 - iter 720/1807 - loss 0.05121190 - samples/sec: 89.58 - lr: 0.006250
2021-12-31 11:04:56,274 epoch 45 - iter 900/1807 - loss 0.05221446 - samples/sec: 90.63 - lr: 0.006250
2021-12-31 11:05:12,479 epoch 45 - iter 1080/1807 - loss 0.05188940 - samples/sec: 88.99 - lr: 0.006250
2021-12-31 11:05:28,572 epoch 45 - iter 1260/1807 - loss 0.05237022 - samples/sec: 89.62 - lr: 0.006250
2021-12-31 11:05:44,476 epoch 45 - iter 1440/1807 - loss 0.05180768 - samples/sec: 90.68 - lr: 0.006250
2021-12-31 11:06:00,356 epoch 45 - iter 1620/1807 - loss 0.05176296 - samples/sec: 90.81 - lr: 0.006250
2021-12-31 11:06:16,343 epoch 45 - iter 1800/1807 - loss 0.05236414 - samples/sec: 90.20 - lr: 0.006250
2021-12-31 11:06:16,948 ----------------------------------------------------------------------------------------------------
2021-12-31 11:06:16,949 EPOCH 45 done: loss 0.0523 - lr 0.0062500
2021-12-31 11:06:56,269 DEV : loss 0.06413871794939041 - f1-score (micro avg)  0.983
2021-12-31 11:06:56,425 BAD EPOCHS (no improvement): 3
2021-12-31 11:06:56,427 ----------------------------------------------------------------------------------------------------
2021-12-31 11:07:12,359 epoch 46 - iter 180/1807 - loss 0.04909660 - samples/sec: 90.52 - lr: 0.006250
2021-12-31 11:07:27,933 epoch 46 - iter 360/1807 - loss 0.04990439 - samples/sec: 92.58 - lr: 0.006250
2021-12-31 11:07:44,036 epoch 46 - iter 540/1807 - loss 0.05183261 - samples/sec: 89.55 - lr: 0.006250
2021-12-31 11:07:59,808 epoch 46 - iter 720/1807 - loss 0.05108367 - samples/sec: 91.44 - lr: 0.006250
2021-12-31 11:08:16,323 epoch 46 - iter 900/1807 - loss 0.05156129 - samples/sec: 87.33 - lr: 0.006250
2021-12-31 11:08:32,181 epoch 46 - iter 1080/1807 - loss 0.05164911 - samples/sec: 90.93 - lr: 0.006250
2021-12-31 11:08:48,124 epoch 46 - iter 1260/1807 - loss 0.05241189 - samples/sec: 90.45 - lr: 0.006250
2021-12-31 11:09:04,600 epoch 46 - iter 1440/1807 - loss 0.05209220 - samples/sec: 87.53 - lr: 0.006250
2021-12-31 11:09:20,227 epoch 46 - iter 1620/1807 - loss 0.05187081 - samples/sec: 92.29 - lr: 0.006250
2021-12-31 11:09:36,191 epoch 46 - iter 1800/1807 - loss 0.05205935 - samples/sec: 90.34 - lr: 0.006250
2021-12-31 11:09:36,782 ----------------------------------------------------------------------------------------------------
2021-12-31 11:09:36,782 EPOCH 46 done: loss 0.0521 - lr 0.0062500
2021-12-31 11:10:13,201 DEV : loss 0.0644669309258461 - f1-score (micro avg)  0.983
2021-12-31 11:10:13,398 BAD EPOCHS (no improvement): 4
2021-12-31 11:10:13,399 ----------------------------------------------------------------------------------------------------
2021-12-31 11:10:29,417 epoch 47 - iter 180/1807 - loss 0.05250873 - samples/sec: 90.04 - lr: 0.003125
2021-12-31 11:10:45,589 epoch 47 - iter 360/1807 - loss 0.05160928 - samples/sec: 89.18 - lr: 0.003125
2021-12-31 11:11:01,280 epoch 47 - iter 540/1807 - loss 0.05161492 - samples/sec: 91.91 - lr: 0.003125
2021-12-31 11:11:17,277 epoch 47 - iter 720/1807 - loss 0.05136337 - samples/sec: 90.15 - lr: 0.003125
2021-12-31 11:11:33,230 epoch 47 - iter 900/1807 - loss 0.05023989 - samples/sec: 90.40 - lr: 0.003125
2021-12-31 11:11:49,156 epoch 47 - iter 1080/1807 - loss 0.05064277 - samples/sec: 90.55 - lr: 0.003125
2021-12-31 11:12:04,959 epoch 47 - iter 1260/1807 - loss 0.05089925 - samples/sec: 91.25 - lr: 0.003125
2021-12-31 11:12:21,092 epoch 47 - iter 1440/1807 - loss 0.05071923 - samples/sec: 89.39 - lr: 0.003125
2021-12-31 11:12:36,949 epoch 47 - iter 1620/1807 - loss 0.05083516 - samples/sec: 90.95 - lr: 0.003125
2021-12-31 11:12:52,744 epoch 47 - iter 1800/1807 - loss 0.05106443 - samples/sec: 91.31 - lr: 0.003125
2021-12-31 11:12:53,321 ----------------------------------------------------------------------------------------------------
2021-12-31 11:12:53,321 EPOCH 47 done: loss 0.0511 - lr 0.0031250
2021-12-31 11:13:29,490 DEV : loss 0.06470787525177002 - f1-score (micro avg)  0.9829
2021-12-31 11:13:29,672 BAD EPOCHS (no improvement): 1
2021-12-31 11:13:29,674 ----------------------------------------------------------------------------------------------------
2021-12-31 11:13:45,987 epoch 48 - iter 180/1807 - loss 0.05119727 - samples/sec: 88.41 - lr: 0.003125
2021-12-31 11:14:02,271 epoch 48 - iter 360/1807 - loss 0.05026057 - samples/sec: 88.57 - lr: 0.003125
2021-12-31 11:14:18,202 epoch 48 - iter 540/1807 - loss 0.04968790 - samples/sec: 90.53 - lr: 0.003125
2021-12-31 11:14:33,834 epoch 48 - iter 720/1807 - loss 0.05040465 - samples/sec: 92.25 - lr: 0.003125
2021-12-31 11:14:49,709 epoch 48 - iter 900/1807 - loss 0.05065504 - samples/sec: 90.84 - lr: 0.003125
2021-12-31 11:15:05,727 epoch 48 - iter 1080/1807 - loss 0.05037297 - samples/sec: 90.02 - lr: 0.003125
2021-12-31 11:15:21,077 epoch 48 - iter 1260/1807 - loss 0.05063199 - samples/sec: 93.96 - lr: 0.003125
2021-12-31 11:15:36,587 epoch 48 - iter 1440/1807 - loss 0.05076731 - samples/sec: 92.98 - lr: 0.003125
2021-12-31 11:15:52,489 epoch 48 - iter 1620/1807 - loss 0.05082260 - samples/sec: 90.68 - lr: 0.003125
2021-12-31 11:16:08,520 epoch 48 - iter 1800/1807 - loss 0.05101165 - samples/sec: 89.96 - lr: 0.003125
2021-12-31 11:16:09,115 ----------------------------------------------------------------------------------------------------
2021-12-31 11:16:09,116 EPOCH 48 done: loss 0.0510 - lr 0.0031250
2021-12-31 11:16:48,035 DEV : loss 0.06484530121088028 - f1-score (micro avg)  0.983
2021-12-31 11:16:48,189 BAD EPOCHS (no improvement): 2
2021-12-31 11:16:48,191 ----------------------------------------------------------------------------------------------------
2021-12-31 11:17:03,775 epoch 49 - iter 180/1807 - loss 0.04706234 - samples/sec: 92.51 - lr: 0.003125
2021-12-31 11:17:19,604 epoch 49 - iter 360/1807 - loss 0.04796051 - samples/sec: 91.07 - lr: 0.003125
2021-12-31 11:17:35,506 epoch 49 - iter 540/1807 - loss 0.04820802 - samples/sec: 90.67 - lr: 0.003125
2021-12-31 11:17:51,301 epoch 49 - iter 720/1807 - loss 0.04872061 - samples/sec: 91.31 - lr: 0.003125
2021-12-31 11:18:06,963 epoch 49 - iter 900/1807 - loss 0.04900955 - samples/sec: 92.08 - lr: 0.003125
2021-12-31 11:18:22,961 epoch 49 - iter 1080/1807 - loss 0.04952427 - samples/sec: 90.14 - lr: 0.003125
2021-12-31 11:18:39,172 epoch 49 - iter 1260/1807 - loss 0.04981242 - samples/sec: 88.96 - lr: 0.003125
2021-12-31 11:18:55,485 epoch 49 - iter 1440/1807 - loss 0.05015633 - samples/sec: 88.41 - lr: 0.003125
2021-12-31 11:19:11,166 epoch 49 - iter 1620/1807 - loss 0.05076498 - samples/sec: 91.97 - lr: 0.003125
2021-12-31 11:19:27,065 epoch 49 - iter 1800/1807 - loss 0.05104387 - samples/sec: 90.71 - lr: 0.003125
2021-12-31 11:19:27,675 ----------------------------------------------------------------------------------------------------
2021-12-31 11:19:27,675 EPOCH 49 done: loss 0.0510 - lr 0.0031250
2021-12-31 11:20:04,021 DEV : loss 0.06486314535140991 - f1-score (micro avg)  0.983
2021-12-31 11:20:04,217 BAD EPOCHS (no improvement): 3
2021-12-31 11:20:04,218 ----------------------------------------------------------------------------------------------------
2021-12-31 11:20:20,650 epoch 50 - iter 180/1807 - loss 0.05726933 - samples/sec: 87.77 - lr: 0.003125
2021-12-31 11:20:36,455 epoch 50 - iter 360/1807 - loss 0.05538766 - samples/sec: 91.25 - lr: 0.003125
2021-12-31 11:20:52,012 epoch 50 - iter 540/1807 - loss 0.05444601 - samples/sec: 92.69 - lr: 0.003125
2021-12-31 11:21:07,973 epoch 50 - iter 720/1807 - loss 0.05313637 - samples/sec: 90.35 - lr: 0.003125
2021-12-31 11:21:23,983 epoch 50 - iter 900/1807 - loss 0.05290526 - samples/sec: 90.08 - lr: 0.003125
2021-12-31 11:21:39,924 epoch 50 - iter 1080/1807 - loss 0.05235234 - samples/sec: 90.47 - lr: 0.003125
2021-12-31 11:21:55,732 epoch 50 - iter 1260/1807 - loss 0.05207690 - samples/sec: 91.23 - lr: 0.003125
2021-12-31 11:22:11,663 epoch 50 - iter 1440/1807 - loss 0.05205514 - samples/sec: 90.52 - lr: 0.003125
2021-12-31 11:22:27,392 epoch 50 - iter 1620/1807 - loss 0.05173851 - samples/sec: 91.69 - lr: 0.003125
2021-12-31 11:22:43,193 epoch 50 - iter 1800/1807 - loss 0.05189058 - samples/sec: 91.27 - lr: 0.003125
2021-12-31 11:22:43,750 ----------------------------------------------------------------------------------------------------
2021-12-31 11:22:43,750 EPOCH 50 done: loss 0.0519 - lr 0.0031250
2021-12-31 11:23:20,432 DEV : loss 0.06452730298042297 - f1-score (micro avg)  0.9831
2021-12-31 11:23:20,619 BAD EPOCHS (no improvement): 4
2021-12-31 11:23:25,890 ----------------------------------------------------------------------------------------------------
2021-12-31 11:23:25,893 loading file models/UPOS_UD_FRENCH_GSD_PLUS_Flair-Embeddings_50_2021-12-31-08:34:44/best-model.pt
2021-12-31 11:23:43,354 0.9797	0.9797	0.9797	0.9797
2021-12-31 11:23:43,354 
Results:
- F-score (micro) 0.9797
- F-score (macro) 0.9178
- Accuracy 0.9797

By class:
              precision    recall  f1-score   support

        PREP     0.9966    0.9987    0.9976      1483
       PUNCT     1.0000    1.0000    1.0000       833
         NMS     0.9634    0.9801    0.9717       753
         DET     0.9923    0.9984    0.9954       645
        VERB     0.9913    0.9811    0.9862       583
         NFS     0.9667    0.9839    0.9752       560
         ADV     0.9940    0.9821    0.9880       504
       PROPN     0.9541    0.8937    0.9229       395
       DETMS     1.0000    1.0000    1.0000       362
         AUX     0.9860    0.9915    0.9888       355
       YPFOR     1.0000    1.0000    1.0000       353
         NMP     0.9666    0.9475    0.9570       305
        COCO     0.9959    1.0000    0.9980       245
       ADJMS     0.9463    0.9385    0.9424       244
       DETFS     1.0000    1.0000    1.0000       240
        CHIF     0.9648    0.9865    0.9755       222
         NFP     0.9515    0.9849    0.9679       199
       ADJFS     0.9657    0.9286    0.9468       182
       VPPMS     0.9387    0.9745    0.9563       157
       COSUB     1.0000    0.9844    0.9921       128
      DINTMS     0.9918    0.9918    0.9918       122
      XFAMIL     0.9298    0.9217    0.9258       115
     PPER3MS     1.0000    1.0000    1.0000        87
       ADJMP     0.9294    0.9634    0.9461        82
      PDEMMS     1.0000    1.0000    1.0000        75
       ADJFP     0.9861    0.9342    0.9595        76
        PREL     0.9859    1.0000    0.9929        70
      DINTFS     0.9839    1.0000    0.9919        61
        PREF     1.0000    1.0000    1.0000        52
     PPOBJMS     0.9565    0.9362    0.9462        47
       PREFP     0.9778    1.0000    0.9888        44
      PINDMS     1.0000    0.9773    0.9885        44
       VPPFS     0.8298    0.9750    0.8966        40
      PPER1S     1.0000    1.0000    1.0000        42
         SYM     1.0000    0.9474    0.9730        38
        NOUN     0.8824    0.7692    0.8219        39
        PRON     1.0000    0.9677    0.9836        31
      PDEMFS     1.0000    1.0000    1.0000        29
       VPPMP     0.9286    1.0000    0.9630        26
         ADJ     0.9524    0.9091    0.9302        22
     PPER3MP     1.0000    1.0000    1.0000        20
       VPPFP     1.0000    1.0000    1.0000        19
     PPER3FS     1.0000    1.0000    1.0000        18
      MOTINC     0.3333    0.4000    0.3636        15
       PREFS     1.0000    1.0000    1.0000        10
     PPOBJMP     1.0000    0.8000    0.8889        10
     PPOBJFS     0.6250    0.8333    0.7143         6
        INTJ     0.5000    0.6667    0.5714         6
        PART     1.0000    1.0000    1.0000         4
      PDEMMP     1.0000    1.0000    1.0000         3
      PDEMFP     1.0000    1.0000    1.0000         3
     PPER3FP     1.0000    1.0000    1.0000         2
         NUM     1.0000    0.3333    0.5000         3
      PPER2S     1.0000    1.0000    1.0000         2
     PPOBJFP     0.5000    0.5000    0.5000         2
      PRELMS     1.0000    1.0000    1.0000         2
      PINDFS     0.5000    1.0000    0.6667         1
      PINDMP     1.0000    1.0000    1.0000         1
           X     0.0000    0.0000    0.0000         1
      PINDFP     1.0000    1.0000    1.0000         1

   micro avg     0.9797    0.9797    0.9797     10019
   macro avg     0.9228    0.9230    0.9178     10019
weighted avg     0.9802    0.9797    0.9798     10019
 samples avg     0.9797    0.9797    0.9797     10019

2021-12-31 11:23:43,354 ----------------------------------------------------------------------------------------------------