hts98 committed
Commit 826fdbd · 1 Parent(s): 38884bf

End of training

all_results.json ADDED
@@ -0,0 +1,25 @@
+ {
+ "epoch": 120.0,
+ "eval_accuracy": 0.8276281577252451,
+ "eval_f1": 0.6784613322610911,
+ "eval_loss": 1.6160385608673096,
+ "eval_precision": 0.6524877545759217,
+ "eval_recall": 0.7065884980457845,
+ "eval_runtime": 2.9494,
+ "eval_samples": 1112,
+ "eval_samples_per_second": 377.029,
+ "eval_steps_per_second": 11.867,
+ "predict_accuracy": 0.8316702819956616,
+ "predict_f1": 0.6882613133718952,
+ "predict_loss": 1.5868676900863647,
+ "predict_precision": 0.6608729743857815,
+ "predict_recall": 0.718017890103649,
+ "predict_runtime": 5.983,
+ "predict_samples_per_second": 371.884,
+ "predict_steps_per_second": 11.7,
+ "train_loss": 0.045685164459416124,
+ "train_runtime": 6878.0479,
+ "train_samples": 7785,
+ "train_samples_per_second": 135.823,
+ "train_steps_per_second": 4.257
+ }
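The reported `eval_f1` and `predict_f1` values can be cross-checked against the precision and recall figures in `all_results.json`. The sketch below assumes F1 is the usual harmonic mean of precision and recall; the diff does not say which metric implementation produced these numbers, and the helper name `f1_from_pr` is purely illustrative.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from all_results.json in this commit.
reported = {
    "eval": (0.6524877545759217, 0.7065884980457845, 0.6784613322610911),
    "predict": (0.6608729743857815, 0.718017890103649, 0.6882613133718952),
}

for split, (precision, recall, f1) in reported.items():
    recomputed = f1_from_pr(precision, recall)
    # Print both values side by side so any mismatch is easy to spot.
    print(f"{split}: reported={f1:.6f} recomputed={recomputed:.6f}")
```

If the recomputed and reported values agree, the three metrics in each split are at least internally consistent with this definition.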
eval_results.json ADDED
@@ -0,0 +1,12 @@
+ {
+ "epoch": 120.0,
+ "eval_accuracy": 0.8276281577252451,
+ "eval_f1": 0.6784613322610911,
+ "eval_loss": 1.6160385608673096,
+ "eval_precision": 0.6524877545759217,
+ "eval_recall": 0.7065884980457845,
+ "eval_runtime": 2.9494,
+ "eval_samples": 1112,
+ "eval_samples_per_second": 377.029,
+ "eval_steps_per_second": 11.867
+ }
predict_results.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "predict_accuracy": 0.8316702819956616,
+ "predict_f1": 0.6882613133718952,
+ "predict_loss": 1.5868676900863647,
+ "predict_precision": 0.6608729743857815,
+ "predict_recall": 0.718017890103649,
+ "predict_runtime": 5.983,
+ "predict_samples_per_second": 371.884,
+ "predict_steps_per_second": 11.7
+ }
predictions.txt ADDED
The diff for this file is too large to render. See raw diff
 
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+ "epoch": 120.0,
+ "train_loss": 0.045685164459416124,
+ "train_runtime": 6878.0479,
+ "train_samples": 7785,
+ "train_samples_per_second": 135.823,
+ "train_steps_per_second": 4.257
+ }
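The throughput figures in `train_results.json` can be related to the step counts recorded in `trainer_state.json` from the same commit. The sketch below is a rough consistency check and should reproduce the reported rates to about three decimals. The implied batch size is an inference that assumes a single process and no gradient accumulation; neither is stated in the diff.

```python
import math

# Values copied from train_results.json and trainer_state.json in this commit.
train_samples = 7785
epochs = 120.0
train_runtime_s = 6878.0479
global_step = 29280  # "global_step" in trainer_state.json

# Throughput recomputed from the totals.
samples_per_second = train_samples * epochs / train_runtime_s
steps_per_second = global_step / train_runtime_s

# Optimizer steps per epoch; the first log entry reports step 244 at epoch 1.0.
steps_per_epoch = global_step / epochs

# Assumption: one process, no gradient accumulation, last partial batch kept.
implied_batch_size = math.ceil(train_samples / steps_per_epoch)

print(f"samples/s ~ {samples_per_second:.3f}")
print(f"steps/s   ~ {steps_per_second:.3f}")
print(f"steps/epoch = {steps_per_epoch:.0f}, implied batch size ~ {implied_batch_size}")
```

Under the same assumption, `checkpoint-26840` in `trainer_state.json` would correspond to 26840 / 244 = 110 epochs.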
trainer_state.json ADDED
@@ -0,0 +1,1813 @@
+ {
+ "best_metric": 0.6784613322610911,
+ "best_model_checkpoint": "/tmp/test-ner1_base/checkpoint-26840",
+ "epoch": 120.0,
+ "global_step": 29280,
+ "is_hyper_param_search": false,
+ "is_local_process_zero": true,
+ "is_world_process_zero": true,
+ "log_history": [
+ {
+ "epoch": 1.0,
+ "eval_accuracy": 0.7635791130936762,
+ "eval_f1": 0.4809298946603705,
+ "eval_loss": 0.8214002847671509,
+ "eval_precision": 0.4246311738293778,
+ "eval_recall": 0.5544388609715243,
+ "eval_runtime": 2.9656,
+ "eval_samples_per_second": 374.972,
+ "eval_steps_per_second": 11.802,
+ "step": 244
+ },
+ {
+ "epoch": 2.0,
+ "eval_accuracy": 0.8022709381932683,
+ "eval_f1": 0.5700258397932816,
+ "eval_loss": 0.6734200119972229,
+ "eval_precision": 0.5305435305435305,
+ "eval_recall": 0.615857063093244,
+ "eval_runtime": 2.95,
+ "eval_samples_per_second": 376.943,
+ "eval_steps_per_second": 11.864,
+ "step": 488
+ },
+ {
+ "epoch": 2.05,
+ "learning_rate": 2.9487704918032787e-05,
+ "loss": 0.9764,
+ "step": 500
+ },
+ {
+ "epoch": 3.0,
+ "eval_accuracy": 0.8072087975000596,
+ "eval_f1": 0.6019592678525393,
+ "eval_loss": 0.6425491571426392,
+ "eval_precision": 0.5591475095785441,
+ "eval_recall": 0.6518704634282524,
+ "eval_runtime": 2.9986,
+ "eval_samples_per_second": 370.843,
+ "eval_steps_per_second": 11.672,
+ "step": 732
+ },
+ {
+ "epoch": 4.0,
+ "eval_accuracy": 0.8214737243863457,
+ "eval_f1": 0.6151294808011226,
+ "eval_loss": 0.6202793121337891,
+ "eval_precision": 0.5663612872915198,
+ "eval_recall": 0.6730876605248465,
+ "eval_runtime": 2.9638,
+ "eval_samples_per_second": 375.198,
+ "eval_steps_per_second": 11.809,
+ "step": 976
+ },
+ {
+ "epoch": 4.1,
+ "learning_rate": 2.8975409836065577e-05,
+ "loss": 0.4504,
+ "step": 1000
+ },
+ {
+ "epoch": 5.0,
+ "eval_accuracy": 0.8170606617208559,
+ "eval_f1": 0.6304066304066305,
+ "eval_loss": 0.6483346223831177,
+ "eval_precision": 0.5879227053140097,
+ "eval_recall": 0.6795086543830262,
+ "eval_runtime": 2.9472,
+ "eval_samples_per_second": 377.302,
+ "eval_steps_per_second": 11.876,
+ "step": 1220
+ },
+ {
+ "epoch": 6.0,
+ "eval_accuracy": 0.8137687555163283,
+ "eval_f1": 0.607830950901181,
+ "eval_loss": 0.6827735900878906,
+ "eval_precision": 0.5478377772798566,
+ "eval_recall": 0.6825795644891123,
+ "eval_runtime": 2.9631,
+ "eval_samples_per_second": 375.283,
+ "eval_steps_per_second": 11.812,
+ "step": 1464
+ },
+ {
+ "epoch": 6.15,
+ "learning_rate": 2.846311475409836e-05,
+ "loss": 0.2877,
+ "step": 1500
+ },
+ {
+ "epoch": 7.0,
+ "eval_accuracy": 0.8115025882016174,
+ "eval_f1": 0.6301228183581126,
+ "eval_loss": 0.709669828414917,
+ "eval_precision": 0.5868047194798941,
+ "eval_recall": 0.6803461753210497,
+ "eval_runtime": 2.9712,
+ "eval_samples_per_second": 374.266,
+ "eval_steps_per_second": 11.78,
+ "step": 1708
+ },
+ {
+ "epoch": 8.0,
+ "eval_accuracy": 0.8133393764461726,
+ "eval_f1": 0.6333632978040324,
+ "eval_loss": 0.7538309097290039,
+ "eval_precision": 0.5864447086801426,
+ "eval_recall": 0.6884422110552764,
+ "eval_runtime": 2.99,
+ "eval_samples_per_second": 371.902,
+ "eval_steps_per_second": 11.706,
+ "step": 1952
+ },
+ {
+ "epoch": 8.2,
+ "learning_rate": 2.795081967213115e-05,
+ "loss": 0.1968,
+ "step": 2000
+ },
+ {
+ "epoch": 9.0,
+ "eval_accuracy": 0.812862288590444,
+ "eval_f1": 0.6361865177295752,
+ "eval_loss": 0.7852667570114136,
+ "eval_precision": 0.594850619383046,
+ "eval_recall": 0.6836962590731435,
+ "eval_runtime": 2.9647,
+ "eval_samples_per_second": 375.08,
+ "eval_steps_per_second": 11.806,
+ "step": 2196
+ },
+ {
+ "epoch": 10.0,
+ "eval_accuracy": 0.8093556928508385,
+ "eval_f1": 0.6372536222425271,
+ "eval_loss": 0.8311049938201904,
+ "eval_precision": 0.5984309879872518,
+ "eval_recall": 0.681462869905081,
+ "eval_runtime": 3.0899,
+ "eval_samples_per_second": 359.881,
+ "eval_steps_per_second": 11.327,
+ "step": 2440
+ },
+ {
+ "epoch": 10.25,
+ "learning_rate": 2.7438524590163935e-05,
+ "loss": 0.1443,
+ "step": 2500
+ },
+ {
+ "epoch": 11.0,
+ "eval_accuracy": 0.8189928675365569,
+ "eval_f1": 0.6475366876310273,
+ "eval_loss": 0.790969967842102,
+ "eval_precision": 0.6101234567901235,
+ "eval_recall": 0.6898380792853155,
+ "eval_runtime": 2.9631,
+ "eval_samples_per_second": 375.284,
+ "eval_steps_per_second": 11.812,
+ "step": 2684
+ },
+ {
+ "epoch": 12.0,
+ "eval_accuracy": 0.8147229312277856,
+ "eval_f1": 0.6334806342604627,
+ "eval_loss": 0.8414269089698792,
+ "eval_precision": 0.5926556420233463,
+ "eval_recall": 0.6803461753210497,
+ "eval_runtime": 2.9571,
+ "eval_samples_per_second": 376.042,
+ "eval_steps_per_second": 11.836,
+ "step": 2928
+ },
+ {
+ "epoch": 12.3,
+ "learning_rate": 2.6926229508196725e-05,
+ "loss": 0.1118,
+ "step": 3000
+ },
+ {
+ "epoch": 13.0,
+ "eval_accuracy": 0.8069225447866225,
+ "eval_f1": 0.6412532637075719,
+ "eval_loss": 0.8946433067321777,
+ "eval_precision": 0.6022560078469839,
+ "eval_recall": 0.6856504745951982,
+ "eval_runtime": 2.9859,
+ "eval_samples_per_second": 372.42,
+ "eval_steps_per_second": 11.722,
+ "step": 3172
+ },
+ {
+ "epoch": 14.0,
+ "eval_accuracy": 0.8129815605543761,
+ "eval_f1": 0.6424639580602883,
+ "eval_loss": 0.9194995760917664,
+ "eval_precision": 0.6054841897233202,
+ "eval_recall": 0.6842546063651591,
+ "eval_runtime": 2.9708,
+ "eval_samples_per_second": 374.309,
+ "eval_steps_per_second": 11.781,
+ "step": 3416
+ },
+ {
+ "epoch": 14.34,
+ "learning_rate": 2.6413934426229508e-05,
+ "loss": 0.0838,
+ "step": 3500
+ },
+ {
+ "epoch": 15.0,
+ "eval_accuracy": 0.8192314114644211,
+ "eval_f1": 0.6421779340888367,
+ "eval_loss": 0.9149069786071777,
+ "eval_precision": 0.6019536019536019,
+ "eval_recall": 0.6881630374092685,
+ "eval_runtime": 2.9929,
+ "eval_samples_per_second": 371.552,
+ "eval_steps_per_second": 11.695,
+ "step": 3660
+ },
+ {
+ "epoch": 16.0,
+ "eval_accuracy": 0.8201140239975191,
+ "eval_f1": 0.6456589958158997,
+ "eval_loss": 0.9356908798217773,
+ "eval_precision": 0.6072306935563208,
+ "eval_recall": 0.6892797319932998,
+ "eval_runtime": 2.9882,
+ "eval_samples_per_second": 372.132,
+ "eval_steps_per_second": 11.713,
+ "step": 3904
+ },
+ {
+ "epoch": 16.39,
+ "learning_rate": 2.5901639344262294e-05,
+ "loss": 0.0661,
+ "step": 4000
+ },
+ {
+ "epoch": 17.0,
+ "eval_accuracy": 0.8172514968631474,
+ "eval_f1": 0.6432016686220832,
+ "eval_loss": 0.9784498810768127,
+ "eval_precision": 0.60332599657618,
+ "eval_recall": 0.6887213847012842,
+ "eval_runtime": 2.968,
+ "eval_samples_per_second": 374.663,
+ "eval_steps_per_second": 11.792,
+ "step": 4148
+ },
+ {
+ "epoch": 18.0,
+ "eval_accuracy": 0.8184442165024689,
+ "eval_f1": 0.6472832522036575,
+ "eval_loss": 0.9842237234115601,
+ "eval_precision": 0.6120925603383927,
+ "eval_recall": 0.6867671691792295,
+ "eval_runtime": 2.981,
+ "eval_samples_per_second": 373.035,
+ "eval_steps_per_second": 11.741,
+ "step": 4392
+ },
+ {
+ "epoch": 18.44,
+ "learning_rate": 2.5389344262295083e-05,
+ "loss": 0.0514,
+ "step": 4500
+ },
+ {
+ "epoch": 19.0,
+ "eval_accuracy": 0.8163688843300494,
+ "eval_f1": 0.6476140534871526,
+ "eval_loss": 1.0097302198410034,
+ "eval_precision": 0.6104794859120118,
+ "eval_recall": 0.6895589056393077,
+ "eval_runtime": 2.9667,
+ "eval_samples_per_second": 374.823,
+ "eval_steps_per_second": 11.797,
+ "step": 4636
+ },
+ {
+ "epoch": 20.0,
+ "eval_accuracy": 0.8167028458290594,
+ "eval_f1": 0.648761408083442,
+ "eval_loss": 1.0300242900848389,
+ "eval_precision": 0.6086105675146771,
+ "eval_recall": 0.6945840312674484,
+ "eval_runtime": 3.032,
+ "eval_samples_per_second": 366.754,
+ "eval_steps_per_second": 11.544,
+ "step": 4880
+ },
+ {
+ "epoch": 20.49,
+ "learning_rate": 2.487704918032787e-05,
+ "loss": 0.0416,
+ "step": 5000
+ },
+ {
+ "epoch": 21.0,
+ "eval_accuracy": 0.8204718398893156,
+ "eval_f1": 0.652764946548766,
+ "eval_loss": 1.0250210762023926,
+ "eval_precision": 0.6190237797246558,
+ "eval_recall": 0.6903964265773311,
+ "eval_runtime": 2.9697,
+ "eval_samples_per_second": 374.455,
+ "eval_steps_per_second": 11.786,
+ "step": 5124
+ },
+ {
+ "epoch": 22.0,
+ "eval_accuracy": 0.8167267002218459,
+ "eval_f1": 0.6531738730450781,
+ "eval_loss": 1.087920069694519,
+ "eval_precision": 0.6170846784206605,
+ "eval_recall": 0.6937465103294249,
+ "eval_runtime": 2.983,
+ "eval_samples_per_second": 372.773,
+ "eval_steps_per_second": 11.733,
+ "step": 5368
+ },
+ {
+ "epoch": 22.54,
+ "learning_rate": 2.436475409836066e-05,
+ "loss": 0.0324,
+ "step": 5500
+ },
+ {
+ "epoch": 23.0,
+ "eval_accuracy": 0.8127191622337253,
+ "eval_f1": 0.6434782608695652,
+ "eval_loss": 1.134914755821228,
+ "eval_precision": 0.6092814371257484,
+ "eval_recall": 0.6817420435510888,
+ "eval_runtime": 2.9583,
+ "eval_samples_per_second": 375.891,
+ "eval_steps_per_second": 11.831,
+ "step": 5612
+ },
+ {
+ "epoch": 24.0,
+ "eval_accuracy": 0.8181102550034589,
+ "eval_f1": 0.6542006847511194,
+ "eval_loss": 1.0993841886520386,
+ "eval_precision": 0.6191425722831505,
+ "eval_recall": 0.6934673366834171,
+ "eval_runtime": 2.985,
+ "eval_samples_per_second": 372.527,
+ "eval_steps_per_second": 11.725,
+ "step": 5856
+ },
+ {
+ "epoch": 24.59,
+ "learning_rate": 2.3852459016393442e-05,
+ "loss": 0.0277,
+ "step": 6000
+ },
+ {
+ "epoch": 25.0,
+ "eval_accuracy": 0.8152715822618736,
+ "eval_f1": 0.6543161214032323,
+ "eval_loss": 1.1400500535964966,
+ "eval_precision": 0.6180193596425912,
+ "eval_recall": 0.695142378559464,
+ "eval_runtime": 3.0938,
+ "eval_samples_per_second": 359.431,
+ "eval_steps_per_second": 11.313,
+ "step": 6100
+ },
+ {
+ "epoch": 26.0,
+ "eval_accuracy": 0.808449225924954,
+ "eval_f1": 0.6410887880751782,
+ "eval_loss": 1.1867998838424683,
+ "eval_precision": 0.5983547060246794,
+ "eval_recall": 0.6903964265773311,
+ "eval_runtime": 2.9903,
+ "eval_samples_per_second": 371.869,
+ "eval_steps_per_second": 11.704,
+ "step": 6344
+ },
+ {
+ "epoch": 26.64,
+ "learning_rate": 2.3340163934426228e-05,
+ "loss": 0.0223,
+ "step": 6500
+ },
+ {
+ "epoch": 27.0,
+ "eval_accuracy": 0.8138880274802605,
+ "eval_f1": 0.6558120912851995,
+ "eval_loss": 1.205185890197754,
+ "eval_precision": 0.628228074661212,
+ "eval_recall": 0.6859296482412061,
+ "eval_runtime": 2.9746,
+ "eval_samples_per_second": 373.828,
+ "eval_steps_per_second": 11.766,
+ "step": 6588
+ },
+ {
+ "epoch": 28.0,
+ "eval_accuracy": 0.81529543665466,
+ "eval_f1": 0.6529110264160862,
+ "eval_loss": 1.1963977813720703,
+ "eval_precision": 0.6168363546064067,
+ "eval_recall": 0.6934673366834171,
+ "eval_runtime": 2.9694,
+ "eval_samples_per_second": 374.481,
+ "eval_steps_per_second": 11.787,
+ "step": 6832
+ },
+ {
+ "epoch": 28.69,
+ "learning_rate": 2.2827868852459018e-05,
+ "loss": 0.019,
+ "step": 7000
+ },
+ {
+ "epoch": 29.0,
+ "eval_accuracy": 0.820161732783092,
+ "eval_f1": 0.6533368644067797,
+ "eval_loss": 1.1897813081741333,
+ "eval_precision": 0.6214105793450881,
+ "eval_recall": 0.6887213847012842,
+ "eval_runtime": 2.9761,
+ "eval_samples_per_second": 373.647,
+ "eval_steps_per_second": 11.76,
+ "step": 7076
+ },
+ {
+ "epoch": 30.0,
+ "eval_accuracy": 0.8134586484101047,
+ "eval_f1": 0.6515647505565013,
+ "eval_loss": 1.2819464206695557,
+ "eval_precision": 0.6135635018495684,
+ "eval_recall": 0.6945840312674484,
+ "eval_runtime": 2.9771,
+ "eval_samples_per_second": 373.523,
+ "eval_steps_per_second": 11.757,
+ "step": 7320
+ },
+ {
+ "epoch": 30.74,
+ "learning_rate": 2.2315573770491804e-05,
+ "loss": 0.0159,
+ "step": 7500
+ },
+ {
+ "epoch": 31.0,
+ "eval_accuracy": 0.8127907254120846,
+ "eval_f1": 0.6476589066178393,
+ "eval_loss": 1.2686526775360107,
+ "eval_precision": 0.609251968503937,
+ "eval_recall": 0.6912339475153545,
+ "eval_runtime": 3.0513,
+ "eval_samples_per_second": 364.436,
+ "eval_steps_per_second": 11.471,
+ "step": 7564
+ },
+ {
+ "epoch": 32.0,
+ "eval_accuracy": 0.8158917964743208,
+ "eval_f1": 0.6522136607026251,
+ "eval_loss": 1.2997089624404907,
+ "eval_precision": 0.612760736196319,
+ "eval_recall": 0.6970965940815187,
+ "eval_runtime": 3.0022,
+ "eval_samples_per_second": 370.396,
+ "eval_steps_per_second": 11.658,
+ "step": 7808
+ },
+ {
+ "epoch": 32.79,
+ "learning_rate": 2.180327868852459e-05,
+ "loss": 0.0141,
+ "step": 8000
+ },
+ {
+ "epoch": 33.0,
+ "eval_accuracy": 0.8157248157248157,
+ "eval_f1": 0.6490136369654441,
+ "eval_loss": 1.2799758911132812,
+ "eval_precision": 0.6172248803827751,
+ "eval_recall": 0.6842546063651591,
+ "eval_runtime": 2.9595,
+ "eval_samples_per_second": 375.745,
+ "eval_steps_per_second": 11.827,
+ "step": 8052
+ },
+ {
+ "epoch": 34.0,
+ "eval_accuracy": 0.8141265714081248,
+ "eval_f1": 0.6543144520910535,
+ "eval_loss": 1.3110435009002686,
+ "eval_precision": 0.6220432813286362,
+ "eval_recall": 0.6901172529313233,
+ "eval_runtime": 2.9482,
+ "eval_samples_per_second": 377.185,
+ "eval_steps_per_second": 11.872,
+ "step": 8296
+ },
+ {
+ "epoch": 34.84,
+ "learning_rate": 2.1290983606557376e-05,
+ "loss": 0.0107,
+ "step": 8500
+ },
+ {
+ "epoch": 35.0,
+ "eval_accuracy": 0.8159633596526801,
+ "eval_f1": 0.6468897020386828,
+ "eval_loss": 1.3342597484588623,
+ "eval_precision": 0.6081081081081081,
+ "eval_recall": 0.6909547738693468,
+ "eval_runtime": 2.9541,
+ "eval_samples_per_second": 376.43,
+ "eval_steps_per_second": 11.848,
+ "step": 8540
+ },
+ {
+ "epoch": 36.0,
+ "eval_accuracy": 0.8130292693399489,
+ "eval_f1": 0.6493064642763675,
+ "eval_loss": 1.3406134843826294,
+ "eval_precision": 0.6110837438423645,
+ "eval_recall": 0.6926298157453936,
+ "eval_runtime": 2.9685,
+ "eval_samples_per_second": 374.606,
+ "eval_steps_per_second": 11.791,
+ "step": 8784
+ },
+ {
+ "epoch": 36.89,
+ "learning_rate": 2.0778688524590166e-05,
+ "loss": 0.0106,
+ "step": 9000
+ },
+ {
+ "epoch": 37.0,
+ "eval_accuracy": 0.8127430166265118,
+ "eval_f1": 0.6453556923883139,
+ "eval_loss": 1.3921236991882324,
+ "eval_precision": 0.6079980251789682,
+ "eval_recall": 0.6876046901172529,
+ "eval_runtime": 2.9551,
+ "eval_samples_per_second": 376.302,
+ "eval_steps_per_second": 11.844,
+ "step": 9028
+ },
+ {
+ "epoch": 38.0,
+ "eval_accuracy": 0.8099997614560721,
+ "eval_f1": 0.6453305351521511,
+ "eval_loss": 1.4060559272766113,
+ "eval_precision": 0.6086095992083127,
+ "eval_recall": 0.6867671691792295,
+ "eval_runtime": 2.9714,
+ "eval_samples_per_second": 374.231,
+ "eval_steps_per_second": 11.779,
+ "step": 9272
+ },
+ {
+ "epoch": 38.93,
+ "learning_rate": 2.0266393442622952e-05,
+ "loss": 0.0088,
+ "step": 9500
+ },
+ {
+ "epoch": 39.0,
+ "eval_accuracy": 0.8165835738651273,
+ "eval_f1": 0.6592208482914507,
+ "eval_loss": 1.382816195487976,
+ "eval_precision": 0.6293475501396294,
+ "eval_recall": 0.692071468453378,
+ "eval_runtime": 2.9744,
+ "eval_samples_per_second": 373.857,
+ "eval_steps_per_second": 11.767,
+ "step": 9516
+ },
+ {
+ "epoch": 40.0,
+ "eval_accuracy": 0.8129577061615897,
+ "eval_f1": 0.6572372769332453,
+ "eval_loss": 1.42629873752594,
+ "eval_precision": 0.6241526487572182,
+ "eval_recall": 0.6940256839754327,
+ "eval_runtime": 3.0851,
+ "eval_samples_per_second": 360.448,
+ "eval_steps_per_second": 11.345,
+ "step": 9760
+ },
+ {
+ "epoch": 40.98,
+ "learning_rate": 1.975409836065574e-05,
+ "loss": 0.0086,
+ "step": 10000
+ },
+ {
+ "epoch": 41.0,
+ "eval_accuracy": 0.8185157796808282,
+ "eval_f1": 0.6573940427765386,
+ "eval_loss": 1.3521106243133545,
+ "eval_precision": 0.620203020549641,
+ "eval_recall": 0.6993299832495813,
+ "eval_runtime": 2.9673,
+ "eval_samples_per_second": 374.747,
+ "eval_steps_per_second": 11.795,
+ "step": 10004
+ },
+ {
+ "epoch": 42.0,
+ "eval_accuracy": 0.8196369361417906,
+ "eval_f1": 0.6713979646491698,
+ "eval_loss": 1.372209072113037,
+ "eval_precision": 0.6451363870303655,
+ "eval_recall": 0.6998883305415968,
+ "eval_runtime": 2.9625,
+ "eval_samples_per_second": 375.361,
+ "eval_steps_per_second": 11.814,
+ "step": 10248
+ },
+ {
+ "epoch": 43.0,
+ "eval_accuracy": 0.822594880847308,
+ "eval_f1": 0.6647074539139727,
+ "eval_loss": 1.3783916234970093,
+ "eval_precision": 0.6372950819672131,
+ "eval_recall": 0.6945840312674484,
+ "eval_runtime": 2.9632,
+ "eval_samples_per_second": 375.273,
+ "eval_steps_per_second": 11.812,
+ "step": 10492
+ },
+ {
+ "epoch": 43.03,
+ "learning_rate": 1.9241803278688525e-05,
+ "loss": 0.0075,
+ "step": 10500
+ },
+ {
+ "epoch": 44.0,
+ "eval_accuracy": 0.8140072994441927,
+ "eval_f1": 0.6623151725056613,
+ "eval_loss": 1.433977484703064,
+ "eval_precision": 0.6333757961783439,
+ "eval_recall": 0.6940256839754327,
+ "eval_runtime": 2.9882,
+ "eval_samples_per_second": 372.133,
+ "eval_steps_per_second": 11.713,
+ "step": 10736
+ },
+ {
+ "epoch": 45.0,
+ "eval_accuracy": 0.816130340402185,
+ "eval_f1": 0.6643708609271523,
+ "eval_loss": 1.390194058418274,
+ "eval_precision": 0.6320564516129032,
+ "eval_recall": 0.7001675041876047,
+ "eval_runtime": 2.9941,
+ "eval_samples_per_second": 371.401,
+ "eval_steps_per_second": 11.69,
+ "step": 10980
+ },
+ {
+ "epoch": 45.08,
+ "learning_rate": 1.872950819672131e-05,
+ "loss": 0.0066,
+ "step": 11000
+ },
+ {
+ "epoch": 46.0,
+ "eval_accuracy": 0.8162496123661173,
+ "eval_f1": 0.6585943669386681,
+ "eval_loss": 1.401918888092041,
+ "eval_precision": 0.62300796812749,
+ "eval_recall": 0.6984924623115578,
+ "eval_runtime": 2.9812,
+ "eval_samples_per_second": 373.006,
+ "eval_steps_per_second": 11.74,
+ "step": 11224
+ },
+ {
+ "epoch": 47.0,
+ "eval_accuracy": 0.816130340402185,
+ "eval_f1": 0.6548463356973996,
+ "eval_loss": 1.431990623474121,
+ "eval_precision": 0.6183035714285714,
+ "eval_recall": 0.6959798994974874,
+ "eval_runtime": 2.9274,
+ "eval_samples_per_second": 379.864,
+ "eval_steps_per_second": 11.956,
+ "step": 11468
+ },
+ {
+ "epoch": 47.13,
+ "learning_rate": 1.82172131147541e-05,
+ "loss": 0.0067,
+ "step": 11500
+ },
+ {
+ "epoch": 48.0,
+ "eval_accuracy": 0.8200424608191599,
+ "eval_f1": 0.6645460569913849,
+ "eval_loss": 1.4461051225662231,
+ "eval_precision": 0.6326015644713601,
+ "eval_recall": 0.6998883305415968,
+ "eval_runtime": 3.0764,
+ "eval_samples_per_second": 361.458,
+ "eval_steps_per_second": 11.377,
+ "step": 11712
+ },
+ {
+ "epoch": 49.0,
+ "eval_accuracy": 0.8202332959614513,
+ "eval_f1": 0.6595464135021096,
+ "eval_loss": 1.432677984237671,
+ "eval_precision": 0.6249375312343828,
+ "eval_recall": 0.69821328866555,
+ "eval_runtime": 3.0963,
+ "eval_samples_per_second": 359.141,
+ "eval_steps_per_second": 11.304,
+ "step": 11956
+ },
+ {
+ "epoch": 49.18,
+ "learning_rate": 1.7704918032786887e-05,
+ "loss": 0.0054,
+ "step": 12000
+ },
+ {
+ "epoch": 50.0,
+ "eval_accuracy": 0.8176331671477303,
+ "eval_f1": 0.6632,
+ "eval_loss": 1.4615715742111206,
+ "eval_precision": 0.6347626339969372,
+ "eval_recall": 0.6943048576214406,
+ "eval_runtime": 3.0477,
+ "eval_samples_per_second": 364.866,
+ "eval_steps_per_second": 11.484,
+ "step": 12200
+ },
+ {
+ "epoch": 51.0,
+ "eval_accuracy": 0.8177047303260896,
+ "eval_f1": 0.6543322475570033,
+ "eval_loss": 1.4536991119384766,
+ "eval_precision": 0.6134864402638651,
+ "eval_recall": 0.7010050251256281,
+ "eval_runtime": 2.9579,
+ "eval_samples_per_second": 375.937,
+ "eval_steps_per_second": 11.833,
+ "step": 12444
+ },
+ {
+ "epoch": 51.23,
+ "learning_rate": 1.7192622950819673e-05,
+ "loss": 0.0052,
+ "step": 12500
+ },
+ {
+ "epoch": 52.0,
+ "eval_accuracy": 0.8095703823859164,
+ "eval_f1": 0.6560255387071029,
+ "eval_loss": 1.5621511936187744,
+ "eval_precision": 0.6265243902439024,
+ "eval_recall": 0.6884422110552764,
+ "eval_runtime": 3.0041,
+ "eval_samples_per_second": 370.165,
+ "eval_steps_per_second": 11.651,
+ "step": 12688
+ },
+ {
+ "epoch": 53.0,
+ "eval_accuracy": 0.8235967653443381,
+ "eval_f1": 0.6670200079501789,
+ "eval_loss": 1.4217201471328735,
+ "eval_precision": 0.6348045397225726,
+ "eval_recall": 0.7026800670016751,
+ "eval_runtime": 3.0478,
+ "eval_samples_per_second": 364.848,
+ "eval_steps_per_second": 11.484,
+ "step": 12932
+ },
+ {
+ "epoch": 53.28,
+ "learning_rate": 1.668032786885246e-05,
+ "loss": 0.0051,
+ "step": 13000
+ },
+ {
+ "epoch": 54.0,
+ "eval_accuracy": 0.8217361227069965,
+ "eval_f1": 0.6597038603913273,
+ "eval_loss": 1.4624608755111694,
+ "eval_precision": 0.6265695630336514,
+ "eval_recall": 0.696538246789503,
+ "eval_runtime": 3.0888,
+ "eval_samples_per_second": 360.016,
+ "eval_steps_per_second": 11.331,
+ "step": 13176
+ },
+ {
+ "epoch": 55.0,
+ "eval_accuracy": 0.8257198063023306,
+ "eval_f1": 0.6645213193885761,
+ "eval_loss": 1.4358925819396973,
+ "eval_precision": 0.6393188854489165,
+ "eval_recall": 0.6917922948073701,
+ "eval_runtime": 3.0221,
+ "eval_samples_per_second": 367.953,
+ "eval_steps_per_second": 11.581,
+ "step": 13420
+ },
+ {
+ "epoch": 55.33,
+ "learning_rate": 1.6168032786885245e-05,
+ "loss": 0.0049,
+ "step": 13500
+ },
+ {
+ "epoch": 56.0,
+ "eval_accuracy": 0.8230719687030367,
+ "eval_f1": 0.6701528559935639,
+ "eval_loss": 1.4616944789886475,
+ "eval_precision": 0.6447368421052632,
+ "eval_recall": 0.6976549413735343,
+ "eval_runtime": 3.0092,
+ "eval_samples_per_second": 369.528,
+ "eval_steps_per_second": 11.631,
+ "step": 13664
+ },
+ {
+ "epoch": 57.0,
+ "eval_accuracy": 0.8181102550034589,
+ "eval_f1": 0.6630275595792837,
+ "eval_loss": 1.5170940160751343,
+ "eval_precision": 0.6337490455586663,
+ "eval_recall": 0.695142378559464,
+ "eval_runtime": 2.9576,
+ "eval_samples_per_second": 375.979,
+ "eval_steps_per_second": 11.834,
+ "step": 13908
+ },
+ {
+ "epoch": 57.38,
+ "learning_rate": 1.5655737704918035e-05,
+ "loss": 0.0037,
+ "step": 14000
+ },
+ {
+ "epoch": 58.0,
+ "eval_accuracy": 0.8205672574604613,
+ "eval_f1": 0.6667548967707781,
+ "eval_loss": 1.4998589754104614,
+ "eval_precision": 0.6338701560140916,
+ "eval_recall": 0.7032384142936907,
+ "eval_runtime": 2.9519,
+ "eval_samples_per_second": 376.709,
+ "eval_steps_per_second": 11.857,
+ "step": 14152
+ },
+ {
+ "epoch": 59.0,
+ "eval_accuracy": 0.8208296557811121,
+ "eval_f1": 0.6617453203269179,
+ "eval_loss": 1.484113335609436,
+ "eval_precision": 0.6268731268731269,
+ "eval_recall": 0.7007258514796203,
+ "eval_runtime": 2.9535,
+ "eval_samples_per_second": 376.498,
+ "eval_steps_per_second": 11.85,
+ "step": 14396
+ },
+ {
+ "epoch": 59.43,
+ "learning_rate": 1.514344262295082e-05,
+ "loss": 0.004,
+ "step": 14500
+ },
+ {
+ "epoch": 60.0,
+ "eval_accuracy": 0.8244316690918633,
+ "eval_f1": 0.6695859872611465,
+ "eval_loss": 1.436055302619934,
+ "eval_precision": 0.6380880121396054,
+ "eval_recall": 0.7043551088777219,
+ "eval_runtime": 2.9713,
+ "eval_samples_per_second": 374.253,
+ "eval_steps_per_second": 11.78,
+ "step": 14640
+ },
+ {
+ "epoch": 61.0,
+ "eval_accuracy": 0.8235252021659789,
+ "eval_f1": 0.6716417910447762,
+ "eval_loss": 1.4800474643707275,
+ "eval_precision": 0.6425293217746048,
+ "eval_recall": 0.7035175879396985,
+ "eval_runtime": 2.9663,
+ "eval_samples_per_second": 374.881,
+ "eval_steps_per_second": 11.799,
+ "step": 14884
+ },
+ {
+ "epoch": 61.48,
+ "learning_rate": 1.4631147540983607e-05,
+ "loss": 0.004,
+ "step": 15000
+ },
922
+ {
923
+ "epoch": 62.0,
924
+ "eval_accuracy": 0.8240977075928533,
925
+ "eval_f1": 0.664367206155479,
926
+ "eval_loss": 1.4699968099594116,
927
+ "eval_precision": 0.6329625884732053,
928
+ "eval_recall": 0.6990508096035735,
929
+ "eval_runtime": 2.9625,
930
+ "eval_samples_per_second": 375.361,
931
+ "eval_steps_per_second": 11.814,
932
+ "step": 15128
933
+ },
934
+ {
935
+ "epoch": 63.0,
936
+ "eval_accuracy": 0.821211326065695,
937
+ "eval_f1": 0.6643754130865829,
938
+ "eval_loss": 1.5107179880142212,
939
+ "eval_precision": 0.6309314586994728,
940
+ "eval_recall": 0.7015633724176438,
941
+ "eval_runtime": 2.9695,
942
+ "eval_samples_per_second": 374.47,
943
+ "eval_steps_per_second": 11.786,
944
+ "step": 15372
945
+ },
946
+ {
947
+ "epoch": 63.52,
948
+ "learning_rate": 1.4118852459016394e-05,
949
+ "loss": 0.0037,
950
+ "step": 15500
951
+ },
952
+ {
953
+ "epoch": 64.0,
954
+ "eval_accuracy": 0.8227141528112402,
955
+ "eval_f1": 0.6691489361702128,
956
+ "eval_loss": 1.5131914615631104,
957
+ "eval_precision": 0.6389029964448959,
958
+ "eval_recall": 0.7024008933556672,
959
+ "eval_runtime": 2.9514,
960
+ "eval_samples_per_second": 376.769,
961
+ "eval_steps_per_second": 11.859,
962
+ "step": 15616
963
+ },
964
+ {
965
+ "epoch": 65.0,
966
+ "eval_accuracy": 0.8239307268433482,
967
+ "eval_f1": 0.6631481725821349,
968
+ "eval_loss": 1.5229130983352661,
969
+ "eval_precision": 0.6287215411558669,
970
+ "eval_recall": 0.7015633724176438,
971
+ "eval_runtime": 3.0184,
972
+ "eval_samples_per_second": 368.409,
973
+ "eval_steps_per_second": 11.596,
974
+ "step": 15860
975
+ },
976
+ {
977
+ "epoch": 65.57,
978
+ "learning_rate": 1.3606557377049181e-05,
979
+ "loss": 0.0033,
980
+ "step": 16000
981
+ },
982
+ {
983
+ "epoch": 66.0,
984
+ "eval_accuracy": 0.8242408339495718,
985
+ "eval_f1": 0.6695929768555469,
986
+ "eval_loss": 1.5573978424072266,
987
+ "eval_precision": 0.6394817073170732,
988
+ "eval_recall": 0.7026800670016751,
989
+ "eval_runtime": 2.9796,
990
+ "eval_samples_per_second": 373.204,
991
+ "eval_steps_per_second": 11.747,
992
+ "step": 16104
993
+ },
994
+ {
995
+ "epoch": 67.0,
996
+ "eval_accuracy": 0.8195892273562176,
997
+ "eval_f1": 0.6637761135199055,
998
+ "eval_loss": 1.5216217041015625,
999
+ "eval_precision": 0.6269545793000745,
1000
+ "eval_recall": 0.7051926298157454,
1001
+ "eval_runtime": 2.9602,
1002
+ "eval_samples_per_second": 375.655,
1003
+ "eval_steps_per_second": 11.824,
1004
+ "step": 16348
1005
+ },
1006
+ {
1007
+ "epoch": 67.62,
1008
+ "learning_rate": 1.3094262295081968e-05,
1009
+ "loss": 0.0033,
1010
+ "step": 16500
1011
+ },
1012
+ {
1013
+ "epoch": 68.0,
1014
+ "eval_accuracy": 0.8242169795567854,
1015
+ "eval_f1": 0.6635576282478347,
1016
+ "eval_loss": 1.4876649379730225,
1017
+ "eval_precision": 0.6347183278103492,
1018
+ "eval_recall": 0.695142378559464,
1019
+ "eval_runtime": 2.9571,
1020
+ "eval_samples_per_second": 376.04,
1021
+ "eval_steps_per_second": 11.836,
1022
+ "step": 16592
1023
+ },
1024
+ {
1025
+ "epoch": 69.0,
1026
+ "eval_accuracy": 0.8194699553922855,
1027
+ "eval_f1": 0.6630635380964935,
1028
+ "eval_loss": 1.5372997522354126,
1029
+ "eval_precision": 0.6281218781218781,
1030
+ "eval_recall": 0.7021217197096594,
1031
+ "eval_runtime": 3.0929,
1032
+ "eval_samples_per_second": 359.536,
1033
+ "eval_steps_per_second": 11.316,
1034
+ "step": 16836
1035
+ },
1036
+ {
1037
+ "epoch": 69.67,
1038
+ "learning_rate": 1.2581967213114756e-05,
1039
+ "loss": 0.0026,
1040
+ "step": 17000
1041
+ },
1042
+ {
1043
+ "epoch": 70.0,
1044
+ "eval_accuracy": 0.8200663152119463,
1045
+ "eval_f1": 0.6651637713831056,
1046
+ "eval_loss": 1.5521858930587769,
1047
+ "eval_precision": 0.6334933063905026,
1048
+ "eval_recall": 0.7001675041876047,
1049
+ "eval_runtime": 3.0082,
1050
+ "eval_samples_per_second": 369.652,
1051
+ "eval_steps_per_second": 11.635,
1052
+ "step": 17080
1053
+ },
1054
+ {
1055
+ "epoch": 71.0,
1056
+ "eval_accuracy": 0.8226902984184538,
1057
+ "eval_f1": 0.6691449814126393,
1058
+ "eval_loss": 1.5180128812789917,
1059
+ "eval_precision": 0.6379746835443038,
1060
+ "eval_recall": 0.7035175879396985,
1061
+ "eval_runtime": 2.9978,
1062
+ "eval_samples_per_second": 370.939,
1063
+ "eval_steps_per_second": 11.675,
1064
+ "step": 17324
1065
+ },
1066
+ {
1067
+ "epoch": 71.72,
1068
+ "learning_rate": 1.206967213114754e-05,
1069
+ "loss": 0.0024,
1070
+ "step": 17500
1071
+ },
1072
+ {
1073
+ "epoch": 72.0,
1074
+ "eval_accuracy": 0.8218076858853558,
1075
+ "eval_f1": 0.67304324397144,
1076
+ "eval_loss": 1.5517120361328125,
1077
+ "eval_precision": 0.6503514709711012,
1078
+ "eval_recall": 0.6973757677275265,
1079
+ "eval_runtime": 2.9567,
1080
+ "eval_samples_per_second": 376.089,
1081
+ "eval_steps_per_second": 11.837,
1082
+ "step": 17568
1083
+ },
1084
+ {
1085
+ "epoch": 73.0,
1086
+ "eval_accuracy": 0.8206388206388207,
1087
+ "eval_f1": 0.6658723854911305,
1088
+ "eval_loss": 1.539225697517395,
1089
+ "eval_precision": 0.6331822759315207,
1090
+ "eval_recall": 0.7021217197096594,
1091
+ "eval_runtime": 2.9669,
1092
+ "eval_samples_per_second": 374.796,
1093
+ "eval_steps_per_second": 11.797,
1094
+ "step": 17812
1095
+ },
1096
+ {
1097
+ "epoch": 73.77,
1098
+ "learning_rate": 1.1557377049180328e-05,
1099
+ "loss": 0.0026,
1100
+ "step": 18000
1101
+ },
1102
+ {
1103
+ "epoch": 74.0,
1104
+ "eval_accuracy": 0.8245986498413683,
1105
+ "eval_f1": 0.669957310565635,
1106
+ "eval_loss": 1.5395687818527222,
1107
+ "eval_precision": 0.641543178334185,
1108
+ "eval_recall": 0.7010050251256281,
1109
+ "eval_runtime": 2.9773,
1110
+ "eval_samples_per_second": 373.49,
1111
+ "eval_steps_per_second": 11.756,
1112
+ "step": 18056
1113
+ },
1114
+ {
1115
+ "epoch": 75.0,
1116
+ "eval_accuracy": 0.8233343670236875,
1117
+ "eval_f1": 0.6740153246404087,
1118
+ "eval_loss": 1.5637731552124023,
1119
+ "eval_precision": 0.6499870365569095,
1120
+ "eval_recall": 0.6998883305415968,
1121
+ "eval_runtime": 2.9662,
1122
+ "eval_samples_per_second": 374.894,
1123
+ "eval_steps_per_second": 11.8,
1124
+ "step": 18300
1125
+ },
1126
+ {
1127
+ "epoch": 75.82,
1128
+ "learning_rate": 1.1045081967213114e-05,
1129
+ "loss": 0.0019,
1130
+ "step": 18500
1131
+ },
1132
+ {
1133
+ "epoch": 76.0,
1134
+ "eval_accuracy": 0.8201855871758784,
1135
+ "eval_f1": 0.6666666666666666,
1136
+ "eval_loss": 1.5789735317230225,
1137
+ "eval_precision": 0.6437857514300572,
1138
+ "eval_recall": 0.6912339475153545,
1139
+ "eval_runtime": 2.9784,
1140
+ "eval_samples_per_second": 373.357,
1141
+ "eval_steps_per_second": 11.751,
1142
+ "step": 18544
1143
+ },
1144
+ {
1145
+ "epoch": 77.0,
1146
+ "eval_accuracy": 0.8216168507430643,
1147
+ "eval_f1": 0.676486341724692,
1148
+ "eval_loss": 1.5545753240585327,
1149
+ "eval_precision": 0.6500257334019557,
1150
+ "eval_recall": 0.7051926298157454,
1151
+ "eval_runtime": 2.9887,
1152
+ "eval_samples_per_second": 372.065,
1153
+ "eval_steps_per_second": 11.711,
1154
+ "step": 18788
1155
+ },
1156
+ {
1157
+ "epoch": 77.87,
1158
+ "learning_rate": 1.0532786885245902e-05,
1159
+ "loss": 0.0029,
1160
+ "step": 19000
1161
+ },
1162
+ {
1163
+ "epoch": 78.0,
1164
+ "eval_accuracy": 0.8236206197371246,
1165
+ "eval_f1": 0.6684357171288311,
1166
+ "eval_loss": 1.5374187231063843,
1167
+ "eval_precision": 0.6369152970922882,
1168
+ "eval_recall": 0.7032384142936907,
1169
+ "eval_runtime": 2.9659,
1170
+ "eval_samples_per_second": 374.933,
1171
+ "eval_steps_per_second": 11.801,
1172
+ "step": 19032
1173
+ },
1174
+ {
1175
+ "epoch": 79.0,
1176
+ "eval_accuracy": 0.8180148374323132,
1177
+ "eval_f1": 0.6651595744680852,
1178
+ "eval_loss": 1.5923025608062744,
1179
+ "eval_precision": 0.6350939563230066,
1180
+ "eval_recall": 0.69821328866555,
1181
+ "eval_runtime": 2.9684,
1182
+ "eval_samples_per_second": 374.615,
1183
+ "eval_steps_per_second": 11.791,
1184
+ "step": 19276
1185
+ },
1186
+ {
1187
+ "epoch": 79.92,
1188
+ "learning_rate": 1.0020491803278688e-05,
1189
+ "loss": 0.0015,
1190
+ "step": 19500
1191
+ },
1192
+ {
1193
+ "epoch": 80.0,
1194
+ "eval_accuracy": 0.8245986498413683,
1195
+ "eval_f1": 0.6673737239825004,
1196
+ "eval_loss": 1.5727756023406982,
1197
+ "eval_precision": 0.6354455945468316,
1198
+ "eval_recall": 0.7026800670016751,
1199
+ "eval_runtime": 2.9791,
1200
+ "eval_samples_per_second": 373.261,
1201
+ "eval_steps_per_second": 11.748,
1202
+ "step": 19520
1203
+ },
1204
+ {
1205
+ "epoch": 81.0,
1206
+ "eval_accuracy": 0.8228811335607452,
1207
+ "eval_f1": 0.6686279753944906,
1208
+ "eval_loss": 1.564627766609192,
1209
+ "eval_precision": 0.6416837782340863,
1210
+ "eval_recall": 0.6979341150195422,
1211
+ "eval_runtime": 2.9678,
1212
+ "eval_samples_per_second": 374.693,
1213
+ "eval_steps_per_second": 11.793,
1214
+ "step": 19764
1215
+ },
1216
+ {
1217
+ "epoch": 81.97,
1218
+ "learning_rate": 9.508196721311476e-06,
1219
+ "loss": 0.0019,
1220
+ "step": 20000
1221
+ },
1222
+ {
1223
+ "epoch": 82.0,
1224
+ "eval_accuracy": 0.8210920541017629,
1225
+ "eval_f1": 0.6615000656771313,
1226
+ "eval_loss": 1.5844708681106567,
1227
+ "eval_precision": 0.6246588935747953,
1228
+ "eval_recall": 0.7029592406476829,
1229
+ "eval_runtime": 2.9641,
1230
+ "eval_samples_per_second": 375.15,
1231
+ "eval_steps_per_second": 11.808,
1232
+ "step": 20008
1233
+ },
1234
+ {
1235
+ "epoch": 83.0,
1236
+ "eval_accuracy": 0.819279120249994,
1237
+ "eval_f1": 0.666935159081756,
1238
+ "eval_loss": 1.589419960975647,
1239
+ "eval_precision": 0.6423584173778123,
1240
+ "eval_recall": 0.6934673366834171,
1241
+ "eval_runtime": 2.9637,
1242
+ "eval_samples_per_second": 375.21,
1243
+ "eval_steps_per_second": 11.81,
1244
+ "step": 20252
1245
+ },
1246
+ {
1247
+ "epoch": 84.0,
1248
+ "eval_accuracy": 0.8169175353641374,
1249
+ "eval_f1": 0.665859238325932,
1250
+ "eval_loss": 1.6702436208724976,
1251
+ "eval_precision": 0.6427643543777605,
1252
+ "eval_recall": 0.690675600223339,
1253
+ "eval_runtime": 2.99,
1254
+ "eval_samples_per_second": 371.906,
1255
+ "eval_steps_per_second": 11.706,
1256
+ "step": 20496
1257
+ },
1258
+ {
1259
+ "epoch": 84.02,
1260
+ "learning_rate": 8.995901639344264e-06,
1261
+ "loss": 0.0012,
1262
+ "step": 20500
1263
+ },
1264
+ {
1265
+ "epoch": 85.0,
1266
+ "eval_accuracy": 0.8188974499654111,
1267
+ "eval_f1": 0.6666666666666667,
1268
+ "eval_loss": 1.6313210725784302,
1269
+ "eval_precision": 0.6341647770219199,
1270
+ "eval_recall": 0.7026800670016751,
1271
+ "eval_runtime": 3.0786,
1272
+ "eval_samples_per_second": 361.198,
1273
+ "eval_steps_per_second": 11.369,
1274
+ "step": 20740
1275
+ },
1276
+ {
1277
+ "epoch": 86.0,
1278
+ "eval_accuracy": 0.8231912406669688,
1279
+ "eval_f1": 0.6679915209326974,
1280
+ "eval_loss": 1.5829322338104248,
1281
+ "eval_precision": 0.6356530509329299,
1282
+ "eval_recall": 0.7037967615857063,
1283
+ "eval_runtime": 2.9543,
1284
+ "eval_samples_per_second": 376.397,
1285
+ "eval_steps_per_second": 11.847,
1286
+ "step": 20984
1287
+ },
1288
+ {
1289
+ "epoch": 86.07,
1290
+ "learning_rate": 8.483606557377049e-06,
1291
+ "loss": 0.0015,
1292
+ "step": 21000
1293
+ },
1294
+ {
1295
+ "epoch": 87.0,
1296
+ "eval_accuracy": 0.8209966365306172,
1297
+ "eval_f1": 0.672317880794702,
1298
+ "eval_loss": 1.605576753616333,
1299
+ "eval_precision": 0.639616935483871,
1300
+ "eval_recall": 0.7085427135678392,
1301
+ "eval_runtime": 2.9894,
1302
+ "eval_samples_per_second": 371.983,
1303
+ "eval_steps_per_second": 11.708,
1304
+ "step": 21228
1305
+ },
1306
+ {
1307
+ "epoch": 88.0,
1308
+ "eval_accuracy": 0.8224994632761623,
1309
+ "eval_f1": 0.6773120425815037,
1310
+ "eval_loss": 1.5823140144348145,
1311
+ "eval_precision": 0.6470887363335875,
1312
+ "eval_recall": 0.7104969290898939,
1313
+ "eval_runtime": 2.973,
1314
+ "eval_samples_per_second": 374.028,
1315
+ "eval_steps_per_second": 11.772,
1316
+ "step": 21472
1317
+ },
1318
+ {
1319
+ "epoch": 88.11,
1320
+ "learning_rate": 7.971311475409837e-06,
1321
+ "loss": 0.0015,
1322
+ "step": 21500
1323
+ },
1324
+ {
1325
+ "epoch": 89.0,
1326
+ "eval_accuracy": 0.8249326113403783,
1327
+ "eval_f1": 0.6675521317572054,
1328
+ "eval_loss": 1.573556661605835,
1329
+ "eval_precision": 0.6366860907017988,
1330
+ "eval_recall": 0.7015633724176438,
1331
+ "eval_runtime": 2.9619,
1332
+ "eval_samples_per_second": 375.438,
1333
+ "eval_steps_per_second": 11.817,
1334
+ "step": 21716
1335
+ },
1336
+ {
1337
+ "epoch": 90.0,
1338
+ "eval_accuracy": 0.823644474129911,
1339
+ "eval_f1": 0.6724645437516724,
1340
+ "eval_loss": 1.5920721292495728,
1341
+ "eval_precision": 0.64568345323741,
1342
+ "eval_recall": 0.7015633724176438,
1343
+ "eval_runtime": 2.9692,
1344
+ "eval_samples_per_second": 374.506,
1345
+ "eval_steps_per_second": 11.788,
1346
+ "step": 21960
1347
+ },
1348
+ {
1349
+ "epoch": 90.16,
1350
+ "learning_rate": 7.459016393442623e-06,
1351
+ "loss": 0.0012,
1352
+ "step": 22000
1353
+ },
1354
+ {
1355
+ "epoch": 91.0,
1356
+ "eval_accuracy": 0.8230958230958231,
1357
+ "eval_f1": 0.6684364215556146,
1358
+ "eval_loss": 1.6113594770431519,
1359
+ "eval_precision": 0.6371457489878543,
1360
+ "eval_recall": 0.7029592406476829,
1361
+ "eval_runtime": 2.9681,
1362
+ "eval_samples_per_second": 374.651,
1363
+ "eval_steps_per_second": 11.792,
1364
+ "step": 22204
1365
+ },
1366
+ {
1367
+ "epoch": 92.0,
1368
+ "eval_accuracy": 0.824527086663009,
1369
+ "eval_f1": 0.6708355508249069,
1370
+ "eval_loss": 1.5752336978912354,
1371
+ "eval_precision": 0.6408235892221658,
1372
+ "eval_recall": 0.7037967615857063,
1373
+ "eval_runtime": 2.9615,
1374
+ "eval_samples_per_second": 375.482,
1375
+ "eval_steps_per_second": 11.818,
1376
+ "step": 22448
1377
+ },
1378
+ {
1379
+ "epoch": 92.21,
1380
+ "learning_rate": 6.946721311475411e-06,
1381
+ "loss": 0.0014,
1382
+ "step": 22500
1383
+ },
1384
+ {
1385
+ "epoch": 93.0,
1386
+ "eval_accuracy": 0.8216884139214237,
1387
+ "eval_f1": 0.6672859986728599,
1388
+ "eval_loss": 1.6123404502868652,
1389
+ "eval_precision": 0.6359726789779914,
1390
+ "eval_recall": 0.7018425460636516,
1391
+ "eval_runtime": 3.0214,
1392
+ "eval_samples_per_second": 368.046,
1393
+ "eval_steps_per_second": 11.584,
1394
+ "step": 22692
1395
+ },
1396
+ {
1397
+ "epoch": 94.0,
1398
+ "eval_accuracy": 0.8221177929915794,
1399
+ "eval_f1": 0.6681721572794899,
1400
+ "eval_loss": 1.618289589881897,
1401
+ "eval_precision": 0.6373542828180436,
1402
+ "eval_recall": 0.7021217197096594,
1403
+ "eval_runtime": 2.9574,
1404
+ "eval_samples_per_second": 376.01,
1405
+ "eval_steps_per_second": 11.835,
1406
+ "step": 22936
1407
+ },
1408
+ {
1409
+ "epoch": 94.26,
1410
+ "learning_rate": 6.434426229508197e-06,
1411
+ "loss": 0.0009,
1412
+ "step": 23000
1413
+ },
1414
+ {
1415
+ "epoch": 95.0,
1416
+ "eval_accuracy": 0.8274611769757401,
1417
+ "eval_f1": 0.6721267454350162,
1418
+ "eval_loss": 1.6077880859375,
1419
+ "eval_precision": 0.6474392136575272,
1420
+ "eval_recall": 0.6987716359575656,
1421
+ "eval_runtime": 2.9598,
1422
+ "eval_samples_per_second": 375.703,
1423
+ "eval_steps_per_second": 11.825,
1424
+ "step": 23180
1425
+ },
1426
+ {
1427
+ "epoch": 96.0,
1428
+ "eval_accuracy": 0.8245747954485818,
1429
+ "eval_f1": 0.6682679476914866,
1430
+ "eval_loss": 1.6201205253601074,
1431
+ "eval_precision": 0.6400817995910021,
1432
+ "eval_recall": 0.6990508096035735,
1433
+ "eval_runtime": 2.9623,
1434
+ "eval_samples_per_second": 375.385,
1435
+ "eval_steps_per_second": 11.815,
1436
+ "step": 23424
1437
+ },
1438
+ {
1439
+ "epoch": 96.31,
1440
+ "learning_rate": 5.922131147540984e-06,
1441
+ "loss": 0.0008,
1442
+ "step": 23500
1443
+ },
1444
+ {
1445
+ "epoch": 97.0,
1446
+ "eval_accuracy": 0.8238114548794161,
1447
+ "eval_f1": 0.6687067589143161,
1448
+ "eval_loss": 1.6216107606887817,
1449
+ "eval_precision": 0.6387900355871886,
1450
+ "eval_recall": 0.7015633724176438,
1451
+ "eval_runtime": 2.9859,
1452
+ "eval_samples_per_second": 372.411,
1453
+ "eval_steps_per_second": 11.722,
1454
+ "step": 23668
1455
+ },
1456
+ {
1457
+ "epoch": 98.0,
1458
+ "eval_accuracy": 0.8243839603062904,
1459
+ "eval_f1": 0.6703077128013853,
1460
+ "eval_loss": 1.6113009452819824,
1461
+ "eval_precision": 0.6410191082802548,
1462
+ "eval_recall": 0.7024008933556672,
1463
+ "eval_runtime": 2.9619,
1464
+ "eval_samples_per_second": 375.438,
1465
+ "eval_steps_per_second": 11.817,
1466
+ "step": 23912
1467
+ },
1468
+ {
1469
+ "epoch": 98.36,
1470
+ "learning_rate": 5.409836065573771e-06,
1471
+ "loss": 0.0011,
1472
+ "step": 24000
1473
+ },
1474
+ {
1475
+ "epoch": 99.0,
1476
+ "eval_accuracy": 0.824527086663009,
1477
+ "eval_f1": 0.6751609442060086,
1478
+ "eval_loss": 1.5995452404022217,
1479
+ "eval_precision": 0.6497160557563242,
1480
+ "eval_recall": 0.7026800670016751,
1481
+ "eval_runtime": 3.0743,
1482
+ "eval_samples_per_second": 361.705,
1483
+ "eval_steps_per_second": 11.385,
1484
+ "step": 24156
1485
+ },
1486
+ {
1487
+ "epoch": 100.0,
1488
+ "eval_accuracy": 0.8258629326590492,
1489
+ "eval_f1": 0.6711105185975204,
1490
+ "eval_loss": 1.5953351259231567,
1491
+ "eval_precision": 0.642255677468742,
1492
+ "eval_recall": 0.7026800670016751,
1493
+ "eval_runtime": 2.974,
1494
+ "eval_samples_per_second": 373.912,
1495
+ "eval_steps_per_second": 11.769,
1496
+ "step": 24400
1497
+ },
1498
+ {
1499
+ "epoch": 100.41,
1500
+ "learning_rate": 4.897540983606557e-06,
1501
+ "loss": 0.0009,
1502
+ "step": 24500
1503
+ },
1504
+ {
1505
+ "epoch": 101.0,
1506
+ "eval_accuracy": 0.8247894849836598,
1507
+ "eval_f1": 0.6724552497996259,
1508
+ "eval_loss": 1.6178245544433594,
1509
+ "eval_precision": 0.6447233606557377,
1510
+ "eval_recall": 0.7026800670016751,
1511
+ "eval_runtime": 3.0886,
1512
+ "eval_samples_per_second": 360.031,
1513
+ "eval_steps_per_second": 11.332,
1514
+ "step": 24644
1515
+ },
1516
+ {
1517
+ "epoch": 102.0,
1518
+ "eval_accuracy": 0.8256720975167577,
1519
+ "eval_f1": 0.67206585236325,
1520
+ "eval_loss": 1.6170806884765625,
1521
+ "eval_precision": 0.640759493670886,
1522
+ "eval_recall": 0.7065884980457845,
1523
+ "eval_runtime": 2.967,
1524
+ "eval_samples_per_second": 374.794,
1525
+ "eval_steps_per_second": 11.797,
1526
+ "step": 24888
1527
+ },
1528
+ {
1529
+ "epoch": 102.46,
1530
+ "learning_rate": 4.385245901639344e-06,
1531
+ "loss": 0.0006,
1532
+ "step": 25000
1533
+ },
1534
+ {
1535
+ "epoch": 103.0,
1536
+ "eval_accuracy": 0.8270795066911572,
1537
+ "eval_f1": 0.6780794436271232,
1538
+ "eval_loss": 1.6054375171661377,
1539
+ "eval_precision": 0.6508344030808729,
1540
+ "eval_recall": 0.7077051926298158,
1541
+ "eval_runtime": 2.9599,
1542
+ "eval_samples_per_second": 375.683,
1543
+ "eval_steps_per_second": 11.825,
1544
+ "step": 25132
1545
+ },
1546
+ {
1547
+ "epoch": 104.0,
1548
+ "eval_accuracy": 0.8251234464826698,
1549
+ "eval_f1": 0.6701319472211115,
1550
+ "eval_loss": 1.621781826019287,
1551
+ "eval_precision": 0.6411629686304514,
1552
+ "eval_recall": 0.7018425460636516,
1553
+ "eval_runtime": 3.0743,
1554
+ "eval_samples_per_second": 361.711,
1555
+ "eval_steps_per_second": 11.385,
1556
+ "step": 25376
1557
+ },
1558
+ {
1559
+ "epoch": 104.51,
1560
+ "learning_rate": 3.872950819672131e-06,
1561
+ "loss": 0.0008,
1562
+ "step": 25500
1563
+ },
1564
+ {
1565
+ "epoch": 105.0,
1566
+ "eval_accuracy": 0.8244555234846497,
1567
+ "eval_f1": 0.6738082485270488,
1568
+ "eval_loss": 1.6307542324066162,
1569
+ "eval_precision": 0.6474523932063819,
1570
+ "eval_recall": 0.7024008933556672,
1571
+ "eval_runtime": 2.9786,
1572
+ "eval_samples_per_second": 373.327,
1573
+ "eval_steps_per_second": 11.75,
1574
+ "step": 25620
1575
+ },
1576
+ {
1577
+ "epoch": 106.0,
1578
+ "eval_accuracy": 0.8267216907993608,
1579
+ "eval_f1": 0.6755638596022955,
1580
+ "eval_loss": 1.6341726779937744,
1581
+ "eval_precision": 0.6471490667348504,
1582
+ "eval_recall": 0.7065884980457845,
1583
+ "eval_runtime": 2.9836,
1584
+ "eval_samples_per_second": 372.709,
1585
+ "eval_steps_per_second": 11.731,
1586
+ "step": 25864
1587
+ },
1588
+ {
1589
+ "epoch": 106.56,
1590
+ "learning_rate": 3.3606557377049183e-06,
1591
+ "loss": 0.0004,
1592
+ "step": 26000
1593
+ },
1594
+ {
1595
+ "epoch": 107.0,
1596
+ "eval_accuracy": 0.8253619904105342,
1597
+ "eval_f1": 0.673863787818206,
1598
+ "eval_loss": 1.634595274925232,
1599
+ "eval_precision": 0.6447334863555215,
1600
+ "eval_recall": 0.705750977107761,
1601
+ "eval_runtime": 2.9821,
1602
+ "eval_samples_per_second": 372.893,
1603
+ "eval_steps_per_second": 11.737,
1604
+ "step": 26108
1605
+ },
1606
+ {
1607
+ "epoch": 108.0,
1608
+ "eval_accuracy": 0.825743660695117,
1609
+ "eval_f1": 0.6736758051636944,
1610
+ "eval_loss": 1.6328423023223877,
1611
+ "eval_precision": 0.6436927772126144,
1612
+ "eval_recall": 0.7065884980457845,
1613
+ "eval_runtime": 2.998,
1614
+ "eval_samples_per_second": 370.912,
1615
+ "eval_steps_per_second": 11.674,
1616
+ "step": 26352
1617
+ },
1618
+ {
1619
+ "epoch": 108.61,
1620
+ "learning_rate": 2.848360655737705e-06,
1621
+ "loss": 0.0008,
1622
+ "step": 26500
1623
+ },
1624
+ {
1625
+ "epoch": 109.0,
1626
+ "eval_accuracy": 0.8256720975167577,
1627
+ "eval_f1": 0.674515050167224,
1628
+ "eval_loss": 1.6220307350158691,
1629
+ "eval_precision": 0.6475725661443616,
1630
+ "eval_recall": 0.7037967615857063,
1631
+ "eval_runtime": 2.9865,
1632
+ "eval_samples_per_second": 372.341,
1633
+ "eval_steps_per_second": 11.719,
1634
+ "step": 26596
1635
+ },
1636
+ {
1637
+ "epoch": 110.0,
1638
+ "eval_accuracy": 0.8276281577252451,
1639
+ "eval_f1": 0.6784613322610911,
1640
+ "eval_loss": 1.6160385608673096,
1641
+ "eval_precision": 0.6524877545759217,
1642
+ "eval_recall": 0.7065884980457845,
1643
+ "eval_runtime": 2.9726,
1644
+ "eval_samples_per_second": 374.081,
1645
+ "eval_steps_per_second": 11.774,
1646
+ "step": 26840
1647
+ },
1648
+ {
1649
+ "epoch": 110.66,
1650
+ "learning_rate": 2.336065573770492e-06,
1651
+ "loss": 0.0006,
1652
+ "step": 27000
1653
+ },
1654
+ {
1655
+ "epoch": 111.0,
1656
+ "eval_accuracy": 0.8270079435127978,
1657
+ "eval_f1": 0.6741363211951447,
1658
+ "eval_loss": 1.609981656074524,
1659
+ "eval_precision": 0.6454661558109834,
1660
+ "eval_recall": 0.7054718034617532,
1661
+ "eval_runtime": 2.9645,
1662
+ "eval_samples_per_second": 375.104,
1663
+ "eval_steps_per_second": 11.806,
1664
+ "step": 27084
1665
+ },
1666
+ {
1667
+ "epoch": 112.0,
1668
+ "eval_accuracy": 0.8246702130197275,
1669
+ "eval_f1": 0.6708255906556942,
1670
+ "eval_loss": 1.6269794702529907,
1671
+ "eval_precision": 0.6394230769230769,
1672
+ "eval_recall": 0.7054718034617532,
1673
+ "eval_runtime": 2.9668,
1674
+ "eval_samples_per_second": 374.813,
1675
+ "eval_steps_per_second": 11.797,
1676
+ "step": 27328
1677
+ },
1678
+ {
1679
+ "epoch": 112.7,
1680
+ "learning_rate": 1.8237704918032786e-06,
1681
+ "loss": 0.0005,
1682
+ "step": 27500
1683
+ },
1684
+ {
1685
+ "epoch": 113.0,
1686
+ "eval_accuracy": 0.8273180506190215,
1687
+ "eval_f1": 0.6754362416107383,
1688
+ "eval_loss": 1.6233818531036377,
1689
+ "eval_precision": 0.6504653567735263,
1690
+ "eval_recall": 0.7024008933556672,
1691
+ "eval_runtime": 2.9684,
1692
+ "eval_samples_per_second": 374.612,
1693
+ "eval_steps_per_second": 11.791,
1694
+ "step": 27572
1695
+ },
1696
+ {
1697
+ "epoch": 114.0,
1698
+ "eval_accuracy": 0.8252188640538155,
1699
+ "eval_f1": 0.6711945665201757,
1700
+ "eval_loss": 1.632752537727356,
1701
+ "eval_precision": 0.6417112299465241,
1702
+ "eval_recall": 0.7035175879396985,
1703
+ "eval_runtime": 2.9514,
1704
+ "eval_samples_per_second": 376.771,
1705
+ "eval_steps_per_second": 11.859,
1706
+ "step": 27816
1707
+ },
1708
+ {
1709
+ "epoch": 114.75,
1710
+ "learning_rate": 1.3114754098360657e-06,
1711
+ "loss": 0.0004,
1712
+ "step": 28000
1713
+ },
1714
+ {
1715
+ "epoch": 115.0,
1716
+ "eval_accuracy": 0.8251473008754562,
1717
+ "eval_f1": 0.6710262912051248,
1718
+ "eval_loss": 1.635224461555481,
1719
+ "eval_precision": 0.6428023523395551,
1720
+ "eval_recall": 0.7018425460636516,
1721
+ "eval_runtime": 2.9586,
1722
+ "eval_samples_per_second": 375.859,
1723
+ "eval_steps_per_second": 11.83,
1724
+ "step": 28060
1725
+ },
1726
+ {
1727
+ "epoch": 116.0,
1728
+ "eval_accuracy": 0.8265308556570693,
1729
+ "eval_f1": 0.6743162108072048,
1730
+ "eval_loss": 1.6268539428710938,
1731
+ "eval_precision": 0.6457960644007156,
1732
+ "eval_recall": 0.7054718034617532,
1733
+ "eval_runtime": 3.1118,
1734
+ "eval_samples_per_second": 357.346,
1735
+ "eval_steps_per_second": 11.247,
1736
+ "step": 28304
1737
+ },
1738
+ {
1739
+ "epoch": 116.8,
1740
+ "learning_rate": 7.991803278688524e-07,
1741
+ "loss": 0.0005,
1742
+ "step": 28500
1743
+ },
1744
+ {
1745
+ "epoch": 117.0,
1746
+ "eval_accuracy": 0.8253381360177476,
1747
+ "eval_f1": 0.6728024543150594,
1748
+ "eval_loss": 1.6376687288284302,
1749
+ "eval_precision": 0.6441890166028097,
1750
+ "eval_recall": 0.7040759352317141,
1751
+ "eval_runtime": 2.9567,
1752
+ "eval_samples_per_second": 376.091,
1753
+ "eval_steps_per_second": 11.837,
1754
+ "step": 28548
1755
+ },
1756
+ {
1757
+ "epoch": 118.0,
1758
+ "eval_accuracy": 0.8256959519095441,
1759
+ "eval_f1": 0.6736027744431106,
1760
+ "eval_loss": 1.6352702379226685,
1761
+ "eval_precision": 0.644955300127714,
1762
+ "eval_recall": 0.7049134561697376,
1763
+ "eval_runtime": 2.977,
1764
+ "eval_samples_per_second": 373.528,
1765
+ "eval_steps_per_second": 11.757,
1766
+ "step": 28792
1767
+ },
1768
+ {
1769
+ "epoch": 118.85,
1770
+ "learning_rate": 2.8688524590163937e-07,
1771
+ "loss": 0.0004,
1772
+ "step": 29000
1773
+ },
1774
+ {
1775
+ "epoch": 119.0,
1776
+ "eval_accuracy": 0.825743660695117,
1777
+ "eval_f1": 0.6747793527681197,
1778
+ "eval_loss": 1.6394999027252197,
1779
+ "eval_precision": 0.6475872689938398,
1780
+ "eval_recall": 0.7043551088777219,
1781
+ "eval_runtime": 2.9575,
1782
+ "eval_samples_per_second": 375.995,
1783
+ "eval_steps_per_second": 11.834,
1784
+ "step": 29036
1785
+ },
1786
+ {
1787
+ "epoch": 120.0,
1788
+ "eval_accuracy": 0.8256005343383984,
1789
+ "eval_f1": 0.6740641711229947,
1790
+ "eval_loss": 1.6384611129760742,
1791
+ "eval_precision": 0.6467419189327861,
1792
+ "eval_recall": 0.7037967615857063,
1793
+ "eval_runtime": 2.9942,
1794
+ "eval_samples_per_second": 371.386,
1795
+ "eval_steps_per_second": 11.689,
1796
+ "step": 29280
1797
+ },
1798
+ {
1799
+ "epoch": 120.0,
1800
+ "step": 29280,
1801
+ "total_flos": 1.220726808511488e+17,
1802
+ "train_loss": 0.045685164459416124,
1803
+ "train_runtime": 6878.0479,
1804
+ "train_samples_per_second": 135.823,
1805
+ "train_steps_per_second": 4.257
1806
+ }
1807
+ ],
1808
+ "max_steps": 29280,
1809
+ "num_train_epochs": 120,
1810
+ "total_flos": 1.220726808511488e+17,
1811
+ "trial_name": null,
1812
+ "trial_params": null
1813
+ }