makhataei committed on
Commit dd450ad
1 Parent(s): c37a5d3

End of training

README.md CHANGED
@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
17
 
18
  This model is a fine-tuned version of [makhataei/qa-persian-mdeberta-v3-base-squad2](https://huggingface.co/makhataei/qa-persian-mdeberta-v3-base-squad2) on the pquad dataset.
19
  It achieves the following results on the evaluation set:
20
- - Loss: 2.4944
21
 
22
  ## Model description
23
 
@@ -36,7 +36,7 @@ More information needed
36
  ### Training hyperparameters
37
 
38
  The following hyperparameters were used during training:
39
- - learning_rate: 2.5e-05
40
  - train_batch_size: 5
41
  - eval_batch_size: 5
42
  - seed: 42
@@ -48,262 +48,262 @@ The following hyperparameters were used during training:
48
 
49
  | Training Loss | Epoch | Step | Validation Loss |
50
  |:-------------:|:-----:|:------:|:---------------:|
51
- | 0.4273 | 0.04 | 500 | 1.2636 |
52
- | 0.3813 | 0.08 | 1000 | 1.2619 |
53
- | 0.3933 | 0.12 | 1500 | 1.3319 |
54
- | 0.3305 | 0.16 | 2000 | 1.4050 |
55
- | 0.3604 | 0.19 | 2500 | 1.4107 |
56
- | 0.3431 | 0.23 | 3000 | 1.3068 |
57
- | 0.3258 | 0.27 | 3500 | 1.3487 |
58
- | 0.3432 | 0.31 | 4000 | 1.4339 |
59
- | 0.3429 | 0.35 | 4500 | 1.3738 |
60
- | 0.3279 | 0.39 | 5000 | 1.4084 |
61
- | 0.3268 | 0.43 | 5500 | 1.3671 |
62
- | 0.3352 | 0.47 | 6000 | 1.3462 |
63
- | 0.3233 | 0.51 | 6500 | 1.3703 |
64
- | 0.3157 | 0.55 | 7000 | 1.4630 |
65
- | 0.3005 | 0.58 | 7500 | 1.4817 |
66
- | 0.2708 | 0.62 | 8000 | 1.4972 |
67
- | 0.3227 | 0.66 | 8500 | 1.4029 |
68
- | 0.6272 | 0.7 | 9000 | 1.0431 |
69
- | 0.5573 | 0.74 | 9500 | 1.0920 |
70
- | 0.5744 | 0.78 | 10000 | 1.0445 |
71
- | 0.5128 | 0.82 | 10500 | 1.0858 |
72
- | 0.5503 | 0.86 | 11000 | 1.0169 |
73
- | 0.5128 | 0.9 | 11500 | 1.0771 |
74
- | 0.5281 | 0.94 | 12000 | 1.0501 |
75
- | 0.5347 | 0.97 | 12500 | 1.0867 |
76
- | 0.4619 | 1.01 | 13000 | 1.2511 |
77
- | 0.3691 | 1.05 | 13500 | 1.2085 |
78
- | 0.3785 | 1.09 | 14000 | 1.2875 |
79
- | 0.3309 | 1.13 | 14500 | 1.3301 |
80
- | 0.3818 | 1.17 | 15000 | 1.2383 |
81
- | 0.3546 | 1.21 | 15500 | 1.2575 |
82
- | 0.3568 | 1.25 | 16000 | 1.3351 |
83
- | 0.3475 | 1.29 | 16500 | 1.3030 |
84
- | 0.3801 | 1.32 | 17000 | 1.3151 |
85
- | 0.383 | 1.36 | 17500 | 1.2519 |
86
- | 0.3878 | 1.4 | 18000 | 1.2408 |
87
- | 0.3568 | 1.44 | 18500 | 1.2846 |
88
- | 0.3901 | 1.48 | 19000 | 1.1482 |
89
- | 0.3732 | 1.52 | 19500 | 1.2964 |
90
- | 0.3585 | 1.56 | 20000 | 1.2875 |
91
- | 0.3854 | 1.6 | 20500 | 1.2647 |
92
- | 0.3802 | 1.64 | 21000 | 1.2905 |
93
- | 0.3383 | 1.68 | 21500 | 1.3686 |
94
- | 0.3809 | 1.71 | 22000 | 1.2277 |
95
- | 0.3487 | 1.75 | 22500 | 1.3850 |
96
- | 0.3704 | 1.79 | 23000 | 1.2682 |
97
- | 0.3868 | 1.83 | 23500 | 1.3091 |
98
- | 0.3772 | 1.87 | 24000 | 1.2671 |
99
- | 0.3492 | 1.91 | 24500 | 1.3259 |
100
- | 0.4124 | 1.95 | 25000 | 1.2334 |
101
- | 0.3716 | 1.99 | 25500 | 1.2383 |
102
- | 0.3068 | 2.03 | 26000 | 1.4346 |
103
- | 0.2693 | 2.06 | 26500 | 1.5702 |
104
- | 0.2776 | 2.1 | 27000 | 1.4791 |
105
- | 0.2574 | 2.14 | 27500 | 1.5752 |
106
- | 0.2764 | 2.18 | 28000 | 1.6362 |
107
- | 0.3035 | 2.22 | 28500 | 1.5172 |
108
- | 0.2961 | 2.26 | 29000 | 1.4787 |
109
- | 0.3115 | 2.3 | 29500 | 1.5763 |
110
- | 0.2846 | 2.34 | 30000 | 1.4942 |
111
- | 0.2971 | 2.38 | 30500 | 1.4641 |
112
- | 0.2448 | 2.42 | 31000 | 1.6608 |
113
- | 0.2864 | 2.45 | 31500 | 1.5140 |
114
- | 0.3112 | 2.49 | 32000 | 1.5064 |
115
- | 0.2768 | 2.53 | 32500 | 1.6051 |
116
- | 0.2938 | 2.57 | 33000 | 1.6976 |
117
- | 0.2839 | 2.61 | 33500 | 1.4711 |
118
- | 0.2675 | 2.65 | 34000 | 1.5766 |
119
- | 0.273 | 2.69 | 34500 | 1.5526 |
120
- | 0.2446 | 2.73 | 35000 | 1.6282 |
121
- | 0.2921 | 2.77 | 35500 | 1.4750 |
122
- | 0.2433 | 2.81 | 36000 | 1.5918 |
123
- | 0.2634 | 2.84 | 36500 | 1.5804 |
124
- | 0.2726 | 2.88 | 37000 | 1.5430 |
125
- | 0.2678 | 2.92 | 37500 | 1.5456 |
126
- | 0.3963 | 2.96 | 38000 | 1.4429 |
127
- | 0.3874 | 3.0 | 38500 | 1.3743 |
128
- | 0.2386 | 3.04 | 39000 | 1.6718 |
129
- | 0.2666 | 3.08 | 39500 | 1.6247 |
130
- | 0.2452 | 3.12 | 40000 | 1.6553 |
131
- | 0.2684 | 3.16 | 40500 | 1.5948 |
132
- | 0.2741 | 3.19 | 41000 | 1.6774 |
133
- | 0.2915 | 3.23 | 41500 | 1.6423 |
134
- | 0.289 | 3.27 | 42000 | 1.6159 |
135
- | 0.2572 | 3.31 | 42500 | 1.6878 |
136
- | 0.2888 | 3.35 | 43000 | 1.6022 |
137
- | 0.2787 | 3.39 | 43500 | 1.6714 |
138
- | 0.2762 | 3.43 | 44000 | 1.6734 |
139
- | 0.304 | 3.47 | 44500 | 1.6225 |
140
- | 0.2964 | 3.51 | 45000 | 1.6075 |
141
- | 0.3047 | 3.55 | 45500 | 1.6200 |
142
- | 0.2929 | 3.58 | 46000 | 1.5646 |
143
- | 0.2828 | 3.62 | 46500 | 1.5764 |
144
- | 0.2882 | 3.66 | 47000 | 1.6570 |
145
- | 0.2756 | 3.7 | 47500 | 1.5030 |
146
- | 0.2741 | 3.74 | 48000 | 1.6237 |
147
- | 0.2819 | 3.78 | 48500 | 1.5456 |
148
- | 0.3243 | 3.82 | 49000 | 1.5030 |
149
- | 0.2999 | 3.86 | 49500 | 1.6339 |
150
- | 0.2867 | 3.9 | 50000 | 1.6627 |
151
- | 0.2834 | 3.94 | 50500 | 1.6580 |
152
- | 0.2784 | 3.97 | 51000 | 1.6321 |
153
- | 0.2846 | 4.01 | 51500 | 1.5986 |
154
- | 0.2059 | 4.05 | 52000 | 1.7993 |
155
- | 0.2204 | 4.09 | 52500 | 1.7942 |
156
- | 0.2144 | 4.13 | 53000 | 1.7884 |
157
- | 0.2385 | 4.17 | 53500 | 1.7064 |
158
- | 0.2225 | 4.21 | 54000 | 1.7386 |
159
- | 0.2119 | 4.25 | 54500 | 1.9515 |
160
- | 0.2033 | 4.29 | 55000 | 1.8603 |
161
- | 0.2121 | 4.32 | 55500 | 1.8144 |
162
- | 0.2489 | 4.36 | 56000 | 1.7729 |
163
- | 0.2284 | 4.4 | 56500 | 1.8237 |
164
- | 0.2319 | 4.44 | 57000 | 1.8922 |
165
- | 0.2425 | 4.48 | 57500 | 1.7491 |
166
- | 0.2535 | 4.52 | 58000 | 1.6738 |
167
- | 0.2251 | 4.56 | 58500 | 1.7717 |
168
- | 0.2449 | 4.6 | 59000 | 1.7209 |
169
- | 0.2472 | 4.64 | 59500 | 1.6438 |
170
- | 0.2179 | 4.68 | 60000 | 1.8039 |
171
- | 0.2635 | 4.71 | 60500 | 1.6948 |
172
- | 0.2301 | 4.75 | 61000 | 1.8228 |
173
- | 0.2454 | 4.79 | 61500 | 1.6865 |
174
- | 0.2146 | 4.83 | 62000 | 1.8147 |
175
- | 0.2639 | 4.87 | 62500 | 1.6340 |
176
- | 0.2488 | 4.91 | 63000 | 1.7649 |
177
- | 0.2448 | 4.95 | 63500 | 1.7029 |
178
- | 0.2373 | 4.99 | 64000 | 1.8508 |
179
- | 0.1982 | 5.03 | 64500 | 1.8193 |
180
- | 0.1676 | 5.07 | 65000 | 1.9439 |
181
- | 0.1397 | 5.1 | 65500 | 2.0506 |
182
- | 0.1829 | 5.14 | 66000 | 1.9656 |
183
- | 0.1469 | 5.18 | 66500 | 2.0149 |
184
- | 0.2015 | 5.22 | 67000 | 1.9251 |
185
- | 0.1728 | 5.26 | 67500 | 1.9232 |
186
- | 0.214 | 5.3 | 68000 | 1.7829 |
187
- | 0.1744 | 5.34 | 68500 | 2.0301 |
188
- | 0.1734 | 5.38 | 69000 | 1.9325 |
189
- | 0.2109 | 5.42 | 69500 | 1.9063 |
190
- | 0.19 | 5.45 | 70000 | 1.9691 |
191
- | 0.1947 | 5.49 | 70500 | 1.9812 |
192
- | 0.198 | 5.53 | 71000 | 1.9603 |
193
- | 0.1889 | 5.57 | 71500 | 1.9647 |
194
- | 0.198 | 5.61 | 72000 | 1.8880 |
195
- | 0.1741 | 5.65 | 72500 | 2.0263 |
196
- | 0.1775 | 5.69 | 73000 | 1.9311 |
197
- | 0.1971 | 5.73 | 73500 | 1.9250 |
198
- | 0.183 | 5.77 | 74000 | 2.0464 |
199
- | 0.1816 | 5.81 | 74500 | 1.9924 |
200
- | 0.21 | 5.84 | 75000 | 1.8805 |
201
- | 0.1999 | 5.88 | 75500 | 1.8812 |
202
- | 0.2089 | 5.92 | 76000 | 1.8398 |
203
- | 0.1945 | 5.96 | 76500 | 1.9466 |
204
- | 0.1828 | 6.0 | 77000 | 1.9279 |
205
- | 0.1423 | 6.04 | 77500 | 2.0748 |
206
- | 0.1327 | 6.08 | 78000 | 2.0871 |
207
- | 0.1297 | 6.12 | 78500 | 2.1302 |
208
- | 0.1313 | 6.16 | 79000 | 2.1704 |
209
- | 0.1463 | 6.19 | 79500 | 2.0676 |
210
- | 0.1496 | 6.23 | 80000 | 2.0896 |
211
- | 0.128 | 6.27 | 80500 | 2.2031 |
212
- | 0.1761 | 6.31 | 81000 | 2.0441 |
213
- | 0.15 | 6.35 | 81500 | 2.1346 |
214
- | 0.1787 | 6.39 | 82000 | 1.9899 |
215
- | 0.1407 | 6.43 | 82500 | 2.0616 |
216
- | 0.1366 | 6.47 | 83000 | 2.2158 |
217
- | 0.149 | 6.51 | 83500 | 2.1434 |
218
- | 0.1295 | 6.55 | 84000 | 2.2094 |
219
- | 0.1423 | 6.58 | 84500 | 2.1137 |
220
- | 0.1595 | 6.62 | 85000 | 2.0735 |
221
- | 0.1494 | 6.66 | 85500 | 2.0534 |
222
- | 0.1315 | 6.7 | 86000 | 2.1229 |
223
- | 0.1778 | 6.74 | 86500 | 2.1022 |
224
- | 0.1234 | 6.78 | 87000 | 2.1475 |
225
- | 0.1531 | 6.82 | 87500 | 2.0641 |
226
- | 0.1537 | 6.86 | 88000 | 2.0913 |
227
- | 0.1734 | 6.9 | 88500 | 2.0269 |
228
- | 0.1531 | 6.94 | 89000 | 2.0718 |
229
- | 0.1731 | 6.97 | 89500 | 2.0188 |
230
- | 0.1496 | 7.01 | 90000 | 2.2257 |
231
- | 0.1202 | 7.05 | 90500 | 2.1846 |
232
- | 0.1125 | 7.09 | 91000 | 2.3543 |
233
- | 0.1127 | 7.13 | 91500 | 2.3571 |
234
- | 0.1303 | 7.17 | 92000 | 2.2526 |
235
- | 0.1151 | 7.21 | 92500 | 2.1961 |
236
- | 0.1148 | 7.25 | 93000 | 2.2848 |
237
- | 0.1097 | 7.29 | 93500 | 2.3361 |
238
- | 0.1132 | 7.32 | 94000 | 2.3850 |
239
- | 0.0794 | 7.36 | 94500 | 2.4030 |
240
- | 0.1133 | 7.4 | 95000 | 2.2968 |
241
- | 0.1174 | 7.44 | 95500 | 2.2693 |
242
- | 0.1178 | 7.48 | 96000 | 2.2723 |
243
- | 0.0895 | 7.52 | 96500 | 2.3682 |
244
- | 0.1269 | 7.56 | 97000 | 2.2746 |
245
- | 0.1124 | 7.6 | 97500 | 2.2634 |
246
- | 0.1354 | 7.64 | 98000 | 2.2400 |
247
- | 0.1329 | 7.68 | 98500 | 2.2261 |
248
- | 0.1363 | 7.71 | 99000 | 2.2394 |
249
- | 0.1219 | 7.75 | 99500 | 2.2641 |
250
- | 0.1067 | 7.79 | 100000 | 2.3639 |
251
- | 0.1243 | 7.83 | 100500 | 2.2853 |
252
- | 0.1429 | 7.87 | 101000 | 2.2218 |
253
- | 0.1282 | 7.91 | 101500 | 2.2358 |
254
- | 0.1277 | 7.95 | 102000 | 2.2241 |
255
- | 0.143 | 7.99 | 102500 | 2.1506 |
256
- | 0.0959 | 8.03 | 103000 | 2.2565 |
257
- | 0.0911 | 8.07 | 103500 | 2.3629 |
258
- | 0.0923 | 8.1 | 104000 | 2.3459 |
259
- | 0.094 | 8.14 | 104500 | 2.3670 |
260
- | 0.0983 | 8.18 | 105000 | 2.3862 |
261
- | 0.114 | 8.22 | 105500 | 2.3531 |
262
- | 0.0783 | 8.26 | 106000 | 2.4318 |
263
- | 0.0998 | 8.3 | 106500 | 2.3581 |
264
- | 0.0627 | 8.34 | 107000 | 2.5447 |
265
- | 0.1007 | 8.38 | 107500 | 2.4340 |
266
- | 0.1046 | 8.42 | 108000 | 2.4324 |
267
- | 0.0896 | 8.45 | 108500 | 2.3896 |
268
- | 0.1194 | 8.49 | 109000 | 2.3735 |
269
- | 0.0913 | 8.53 | 109500 | 2.3917 |
270
- | 0.1212 | 8.57 | 110000 | 2.3616 |
271
- | 0.0998 | 8.61 | 110500 | 2.3847 |
272
- | 0.0902 | 8.65 | 111000 | 2.4282 |
273
- | 0.0786 | 8.69 | 111500 | 2.4669 |
274
- | 0.0944 | 8.73 | 112000 | 2.4121 |
275
- | 0.1072 | 8.77 | 112500 | 2.3918 |
276
- | 0.1386 | 8.81 | 113000 | 2.3239 |
277
- | 0.098 | 8.84 | 113500 | 2.3491 |
278
- | 0.0997 | 8.88 | 114000 | 2.3698 |
279
- | 0.1054 | 8.92 | 114500 | 2.4200 |
280
- | 0.1069 | 8.96 | 115000 | 2.3614 |
281
- | 0.1103 | 9.0 | 115500 | 2.3551 |
282
- | 0.0943 | 9.04 | 116000 | 2.4380 |
283
- | 0.0881 | 9.08 | 116500 | 2.4843 |
284
- | 0.0665 | 9.12 | 117000 | 2.5239 |
285
- | 0.0789 | 9.16 | 117500 | 2.5221 |
286
- | 0.0773 | 9.2 | 118000 | 2.5397 |
287
- | 0.0818 | 9.23 | 118500 | 2.4990 |
288
- | 0.0684 | 9.27 | 119000 | 2.5446 |
289
- | 0.0711 | 9.31 | 119500 | 2.5097 |
290
- | 0.0842 | 9.35 | 120000 | 2.5173 |
291
- | 0.0819 | 9.39 | 120500 | 2.4953 |
292
- | 0.0753 | 9.43 | 121000 | 2.5070 |
293
- | 0.09 | 9.47 | 121500 | 2.4626 |
294
- | 0.0761 | 9.51 | 122000 | 2.4711 |
295
- | 0.074 | 9.55 | 122500 | 2.4678 |
296
- | 0.0789 | 9.58 | 123000 | 2.4595 |
297
- | 0.0668 | 9.62 | 123500 | 2.4830 |
298
- | 0.0912 | 9.66 | 124000 | 2.4984 |
299
- | 0.0856 | 9.7 | 124500 | 2.4839 |
300
- | 0.0806 | 9.74 | 125000 | 2.4717 |
301
- | 0.0842 | 9.78 | 125500 | 2.4759 |
302
- | 0.0876 | 9.82 | 126000 | 2.4794 |
303
- | 0.0788 | 9.86 | 126500 | 2.4893 |
304
- | 0.0671 | 9.9 | 127000 | 2.4955 |
305
- | 0.0897 | 9.94 | 127500 | 2.4928 |
306
- | 0.0685 | 9.97 | 128000 | 2.4944 |
307
 
308
 
309
  ### Framework versions
 
17
 
18
  This model is a fine-tuned version of [makhataei/qa-persian-mdeberta-v3-base-squad2](https://huggingface.co/makhataei/qa-persian-mdeberta-v3-base-squad2) on the pquad dataset.
19
  It achieves the following results on the evaluation set:
20
+ - Loss: 2.2905
21
 
22
  ## Model description
23
 
 
36
  ### Training hyperparameters
37
 
38
  The following hyperparameters were used during training:
39
+ - learning_rate: 1.25e-05
40
  - train_batch_size: 5
41
  - eval_batch_size: 5
42
  - seed: 42
 
48
 
49
  | Training Loss | Epoch | Step | Validation Loss |
50
  |:-------------:|:-----:|:------:|:---------------:|
51
+ | 0.2324 | 0.04 | 500 | 1.4461 |
52
+ | 0.2076 | 0.08 | 1000 | 1.5599 |
53
+ | 0.2298 | 0.12 | 1500 | 1.6634 |
54
+ | 0.2049 | 0.16 | 2000 | 1.7076 |
55
+ | 0.201 | 0.19 | 2500 | 1.7011 |
56
+ | 0.1981 | 0.23 | 3000 | 1.6738 |
57
+ | 0.1588 | 0.27 | 3500 | 1.7657 |
58
+ | 0.1836 | 0.31 | 4000 | 1.7728 |
59
+ | 0.1958 | 0.35 | 4500 | 1.6861 |
60
+ | 0.162 | 0.39 | 5000 | 1.7768 |
61
+ | 0.1811 | 0.43 | 5500 | 1.7534 |
62
+ | 0.1775 | 0.47 | 6000 | 1.7344 |
63
+ | 0.1806 | 0.51 | 6500 | 1.7266 |
64
+ | 0.1566 | 0.55 | 7000 | 1.8093 |
65
+ | 0.1517 | 0.58 | 7500 | 1.7544 |
66
+ | 0.1146 | 0.62 | 8000 | 1.9351 |
67
+ | 0.154 | 0.66 | 8500 | 1.8271 |
68
+ | 0.323 | 0.7 | 9000 | 1.4894 |
69
+ | 0.2732 | 0.74 | 9500 | 1.4975 |
70
+ | 0.2902 | 0.78 | 10000 | 1.5645 |
71
+ | 0.2561 | 0.82 | 10500 | 1.5566 |
72
+ | 0.2754 | 0.86 | 11000 | 1.4860 |
73
+ | 0.5959 | 0.9 | 11500 | 1.1121 |
74
+ | 0.5385 | 0.94 | 12000 | 1.1161 |
75
+ | 0.5452 | 0.97 | 12500 | 1.0867 |
76
+ | 0.4369 | 1.01 | 13000 | 1.2922 |
77
+ | 0.3144 | 1.05 | 13500 | 1.3008 |
78
+ | 0.3284 | 1.09 | 14000 | 1.4088 |
79
+ | 0.292 | 1.13 | 14500 | 1.4120 |
80
+ | 0.3237 | 1.17 | 15000 | 1.3833 |
81
+ | 0.3077 | 1.21 | 15500 | 1.3974 |
82
+ | 0.3051 | 1.25 | 16000 | 1.5286 |
83
+ | 0.3015 | 1.29 | 16500 | 1.4756 |
84
+ | 0.3496 | 1.32 | 17000 | 1.4013 |
85
+ | 0.3178 | 1.36 | 17500 | 1.3949 |
86
+ | 0.3188 | 1.4 | 18000 | 1.3854 |
87
+ | 0.3176 | 1.44 | 18500 | 1.4037 |
88
+ | 0.3291 | 1.48 | 19000 | 1.3074 |
89
+ | 0.3241 | 1.52 | 19500 | 1.4160 |
90
+ | 0.3164 | 1.56 | 20000 | 1.4171 |
91
+ | 0.3118 | 1.6 | 20500 | 1.4151 |
92
+ | 0.3429 | 1.64 | 21000 | 1.4271 |
93
+ | 0.2833 | 1.68 | 21500 | 1.4760 |
94
+ | 0.3184 | 1.71 | 22000 | 1.3960 |
95
+ | 0.2887 | 1.75 | 22500 | 1.4839 |
96
+ | 0.31 | 1.79 | 23000 | 1.4136 |
97
+ | 0.3282 | 1.83 | 23500 | 1.3990 |
98
+ | 0.3153 | 1.87 | 24000 | 1.4032 |
99
+ | 0.2832 | 1.91 | 24500 | 1.4633 |
100
+ | 0.3439 | 1.95 | 25000 | 1.3783 |
101
+ | 0.3133 | 1.99 | 25500 | 1.4371 |
102
+ | 0.2562 | 2.03 | 26000 | 1.5103 |
103
+ | 0.2338 | 2.06 | 26500 | 1.6106 |
104
+ | 0.2464 | 2.1 | 27000 | 1.6430 |
105
+ | 0.2187 | 2.14 | 27500 | 1.6828 |
106
+ | 0.2353 | 2.18 | 28000 | 1.6362 |
107
+ | 0.2726 | 2.22 | 28500 | 1.5727 |
108
+ | 0.2491 | 2.26 | 29000 | 1.5545 |
109
+ | 0.2743 | 2.3 | 29500 | 1.5949 |
110
+ | 0.2419 | 2.34 | 30000 | 1.6422 |
111
+ | 0.2661 | 2.38 | 30500 | 1.5882 |
112
+ | 0.2105 | 2.42 | 31000 | 1.6584 |
113
+ | 0.2323 | 2.45 | 31500 | 1.6550 |
114
+ | 0.2778 | 2.49 | 32000 | 1.5761 |
115
+ | 0.2411 | 2.53 | 32500 | 1.6776 |
116
+ | 0.2552 | 2.57 | 33000 | 1.6707 |
117
+ | 0.2468 | 2.61 | 33500 | 1.5738 |
118
+ | 0.2398 | 2.65 | 34000 | 1.6479 |
119
+ | 0.2318 | 2.69 | 34500 | 1.6217 |
120
+ | 0.2093 | 2.73 | 35000 | 1.7018 |
121
+ | 0.2344 | 2.77 | 35500 | 1.6763 |
122
+ | 0.2243 | 2.81 | 36000 | 1.6870 |
123
+ | 0.1943 | 2.84 | 36500 | 1.6926 |
124
+ | 0.221 | 2.88 | 37000 | 1.6862 |
125
+ | 0.2256 | 2.92 | 37500 | 1.7141 |
126
+ | 0.3765 | 2.96 | 38000 | 1.5414 |
127
+ | 0.3601 | 3.0 | 38500 | 1.4698 |
128
+ | 0.2237 | 3.04 | 39000 | 1.7001 |
129
+ | 0.2426 | 3.08 | 39500 | 1.6693 |
130
+ | 0.2216 | 3.12 | 40000 | 1.7385 |
131
+ | 0.2417 | 3.16 | 40500 | 1.6941 |
132
+ | 0.2604 | 3.19 | 41000 | 1.6964 |
133
+ | 0.2762 | 3.23 | 41500 | 1.6379 |
134
+ | 0.2399 | 3.27 | 42000 | 1.6806 |
135
+ | 0.2249 | 3.31 | 42500 | 1.7414 |
136
+ | 0.2582 | 3.35 | 43000 | 1.6874 |
137
+ | 0.2524 | 3.39 | 43500 | 1.6648 |
138
+ | 0.2359 | 3.43 | 44000 | 1.7382 |
139
+ | 0.2729 | 3.47 | 44500 | 1.6762 |
140
+ | 0.2729 | 3.51 | 45000 | 1.6736 |
141
+ | 0.2478 | 3.55 | 45500 | 1.7487 |
142
+ | 0.2557 | 3.58 | 46000 | 1.6379 |
143
+ | 0.2486 | 3.62 | 46500 | 1.6746 |
144
+ | 0.2541 | 3.66 | 47000 | 1.6942 |
145
+ | 0.2613 | 3.7 | 47500 | 1.6501 |
146
+ | 0.2552 | 3.74 | 48000 | 1.6790 |
147
+ | 0.2692 | 3.78 | 48500 | 1.6246 |
148
+ | 0.2769 | 3.82 | 49000 | 1.6306 |
149
+ | 0.2542 | 3.86 | 49500 | 1.6412 |
150
+ | 0.2477 | 3.9 | 50000 | 1.6786 |
151
+ | 0.2686 | 3.94 | 50500 | 1.6677 |
152
+ | 0.2324 | 3.97 | 51000 | 1.7063 |
153
+ | 0.2509 | 4.01 | 51500 | 1.6490 |
154
+ | 0.1966 | 4.05 | 52000 | 1.8161 |
155
+ | 0.227 | 4.09 | 52500 | 1.7389 |
156
+ | 0.1881 | 4.13 | 53000 | 1.8164 |
157
+ | 0.2244 | 4.17 | 53500 | 1.7851 |
158
+ | 0.2068 | 4.21 | 54000 | 1.8039 |
159
+ | 0.2094 | 4.25 | 54500 | 1.8641 |
160
+ | 0.1783 | 4.29 | 55000 | 1.8781 |
161
+ | 0.1916 | 4.32 | 55500 | 1.8887 |
162
+ | 0.2221 | 4.36 | 56000 | 1.8061 |
163
+ | 0.2238 | 4.4 | 56500 | 1.7892 |
164
+ | 0.1996 | 4.44 | 57000 | 1.8320 |
165
+ | 0.2074 | 4.48 | 57500 | 1.8944 |
166
+ | 0.2401 | 4.52 | 58000 | 1.7803 |
167
+ | 0.2174 | 4.56 | 58500 | 1.8466 |
168
+ | 0.2258 | 4.6 | 59000 | 1.8607 |
169
+ | 0.223 | 4.64 | 59500 | 1.7695 |
170
+ | 0.185 | 4.68 | 60000 | 1.8845 |
171
+ | 0.2464 | 4.71 | 60500 | 1.8049 |
172
+ | 0.2223 | 4.75 | 61000 | 1.8136 |
173
+ | 0.2192 | 4.79 | 61500 | 1.7870 |
174
+ | 0.2191 | 4.83 | 62000 | 1.7845 |
175
+ | 0.2471 | 4.87 | 62500 | 1.7158 |
176
+ | 0.2085 | 4.91 | 63000 | 1.7816 |
177
+ | 0.2316 | 4.95 | 63500 | 1.7406 |
178
+ | 0.2449 | 4.99 | 64000 | 1.7465 |
179
+ | 0.196 | 5.03 | 64500 | 1.8431 |
180
+ | 0.1851 | 5.07 | 65000 | 1.8751 |
181
+ | 0.1393 | 5.1 | 65500 | 1.9697 |
182
+ | 0.1752 | 5.14 | 66000 | 1.9985 |
183
+ | 0.1438 | 5.18 | 66500 | 2.0071 |
184
+ | 0.2112 | 5.22 | 67000 | 1.9434 |
185
+ | 0.1715 | 5.26 | 67500 | 1.9735 |
186
+ | 0.1982 | 5.3 | 68000 | 1.9319 |
187
+ | 0.1768 | 5.34 | 68500 | 1.9622 |
188
+ | 0.1872 | 5.38 | 69000 | 1.8810 |
189
+ | 0.2059 | 5.42 | 69500 | 1.8445 |
190
+ | 0.1903 | 5.45 | 70000 | 1.8744 |
191
+ | 0.1835 | 5.49 | 70500 | 1.9283 |
192
+ | 0.1843 | 5.53 | 71000 | 1.9938 |
193
+ | 0.1727 | 5.57 | 71500 | 1.9865 |
194
+ | 0.1994 | 5.61 | 72000 | 1.9390 |
195
+ | 0.172 | 5.65 | 72500 | 2.0077 |
196
+ | 0.163 | 5.69 | 73000 | 1.9794 |
197
+ | 0.196 | 5.73 | 73500 | 1.9307 |
198
+ | 0.183 | 5.77 | 74000 | 1.9463 |
199
+ | 0.1764 | 5.81 | 74500 | 1.9981 |
200
+ | 0.1951 | 5.84 | 75000 | 1.9378 |
201
+ | 0.2014 | 5.88 | 75500 | 1.9199 |
202
+ | 0.1766 | 5.92 | 76000 | 1.9824 |
203
+ | 0.1996 | 5.96 | 76500 | 1.9309 |
204
+ | 0.1919 | 6.0 | 77000 | 1.9458 |
205
+ | 0.1664 | 6.04 | 77500 | 2.0603 |
206
+ | 0.1517 | 6.08 | 78000 | 2.0253 |
207
+ | 0.1461 | 6.12 | 78500 | 2.1310 |
208
+ | 0.1379 | 6.16 | 79000 | 2.1506 |
209
+ | 0.1532 | 6.19 | 79500 | 2.0715 |
210
+ | 0.1546 | 6.23 | 80000 | 2.1345 |
211
+ | 0.156 | 6.27 | 80500 | 2.1732 |
212
+ | 0.1648 | 6.31 | 81000 | 2.1075 |
213
+ | 0.1494 | 6.35 | 81500 | 2.1547 |
214
+ | 0.1741 | 6.39 | 82000 | 2.0228 |
215
+ | 0.1391 | 6.43 | 82500 | 2.0426 |
216
+ | 0.1541 | 6.47 | 83000 | 2.0919 |
217
+ | 0.1609 | 6.51 | 83500 | 2.1206 |
218
+ | 0.159 | 6.55 | 84000 | 2.0798 |
219
+ | 0.153 | 6.58 | 84500 | 2.1216 |
220
+ | 0.1822 | 6.62 | 85000 | 2.1276 |
221
+ | 0.1466 | 6.66 | 85500 | 2.1533 |
222
+ | 0.1583 | 6.7 | 86000 | 2.1250 |
223
+ | 0.2012 | 6.74 | 86500 | 2.0619 |
224
+ | 0.1501 | 6.78 | 87000 | 2.0804 |
225
+ | 0.1748 | 6.82 | 87500 | 2.0684 |
226
+ | 0.1571 | 6.86 | 88000 | 2.0902 |
227
+ | 0.169 | 6.9 | 88500 | 2.0587 |
228
+ | 0.183 | 6.94 | 89000 | 2.0435 |
229
+ | 0.1891 | 6.97 | 89500 | 1.9954 |
230
+ | 0.1647 | 7.01 | 90000 | 2.0333 |
231
+ | 0.1511 | 7.05 | 90500 | 2.0657 |
232
+ | 0.1345 | 7.09 | 91000 | 2.1329 |
233
+ | 0.1394 | 7.13 | 91500 | 2.1481 |
234
+ | 0.133 | 7.17 | 92000 | 2.1518 |
235
+ | 0.1508 | 7.21 | 92500 | 2.1051 |
236
+ | 0.1493 | 7.25 | 93000 | 2.1017 |
237
+ | 0.148 | 7.29 | 93500 | 2.0833 |
238
+ | 0.1416 | 7.32 | 94000 | 2.1286 |
239
+ | 0.1185 | 7.36 | 94500 | 2.1419 |
240
+ | 0.1274 | 7.4 | 95000 | 2.1302 |
241
+ | 0.1326 | 7.44 | 95500 | 2.1720 |
242
+ | 0.1378 | 7.48 | 96000 | 2.1826 |
243
+ | 0.1117 | 7.52 | 96500 | 2.2190 |
244
+ | 0.1454 | 7.56 | 97000 | 2.1884 |
245
+ | 0.1288 | 7.6 | 97500 | 2.2184 |
246
+ | 0.1605 | 7.64 | 98000 | 2.1831 |
247
+ | 0.1492 | 7.68 | 98500 | 2.1518 |
248
+ | 0.1573 | 7.71 | 99000 | 2.1452 |
249
+ | 0.1496 | 7.75 | 99500 | 2.1474 |
250
+ | 0.1382 | 7.79 | 100000 | 2.1298 |
251
+ | 0.1368 | 7.83 | 100500 | 2.1231 |
252
+ | 0.1699 | 7.87 | 101000 | 2.0813 |
253
+ | 0.153 | 7.91 | 101500 | 2.1481 |
254
+ | 0.1412 | 7.95 | 102000 | 2.1022 |
255
+ | 0.1663 | 7.99 | 102500 | 2.0768 |
256
+ | 0.1055 | 8.03 | 103000 | 2.1489 |
257
+ | 0.1165 | 8.07 | 103500 | 2.1983 |
258
+ | 0.1165 | 8.1 | 104000 | 2.2075 |
259
+ | 0.1172 | 8.14 | 104500 | 2.1885 |
260
+ | 0.1222 | 8.18 | 105000 | 2.1968 |
261
+ | 0.1407 | 8.22 | 105500 | 2.2263 |
262
+ | 0.1048 | 8.26 | 106000 | 2.2442 |
263
+ | 0.1293 | 8.3 | 106500 | 2.2103 |
264
+ | 0.0964 | 8.34 | 107000 | 2.2572 |
265
+ | 0.1516 | 8.38 | 107500 | 2.2265 |
266
+ | 0.1415 | 8.42 | 108000 | 2.2039 |
267
+ | 0.1135 | 8.45 | 108500 | 2.2160 |
268
+ | 0.1431 | 8.49 | 109000 | 2.2018 |
269
+ | 0.1161 | 8.53 | 109500 | 2.2555 |
270
+ | 0.1705 | 8.57 | 110000 | 2.2277 |
271
+ | 0.1299 | 8.61 | 110500 | 2.2269 |
272
+ | 0.1354 | 8.65 | 111000 | 2.1957 |
273
+ | 0.0906 | 8.69 | 111500 | 2.2220 |
274
+ | 0.1186 | 8.73 | 112000 | 2.2277 |
275
+ | 0.1482 | 8.77 | 112500 | 2.1811 |
276
+ | 0.1628 | 8.81 | 113000 | 2.1620 |
277
+ | 0.1141 | 8.84 | 113500 | 2.1916 |
278
+ | 0.0998 | 8.88 | 114000 | 2.2243 |
279
+ | 0.1227 | 8.92 | 114500 | 2.2303 |
280
+ | 0.1434 | 8.96 | 115000 | 2.2154 |
281
+ | 0.1358 | 9.0 | 115500 | 2.1964 |
282
+ | 0.1263 | 9.04 | 116000 | 2.2122 |
283
+ | 0.0955 | 9.08 | 116500 | 2.2367 |
284
+ | 0.1016 | 9.12 | 117000 | 2.2425 |
285
+ | 0.1106 | 9.16 | 117500 | 2.2399 |
286
+ | 0.1081 | 9.2 | 118000 | 2.2621 |
287
+ | 0.1318 | 9.23 | 118500 | 2.2402 |
288
+ | 0.1174 | 9.27 | 119000 | 2.2364 |
289
+ | 0.1071 | 9.31 | 119500 | 2.2163 |
290
+ | 0.1049 | 9.35 | 120000 | 2.2512 |
291
+ | 0.1289 | 9.39 | 120500 | 2.2354 |
292
+ | 0.1214 | 9.43 | 121000 | 2.2384 |
293
+ | 0.1149 | 9.47 | 121500 | 2.2346 |
294
+ | 0.0977 | 9.51 | 122000 | 2.2553 |
295
+ | 0.1088 | 9.55 | 122500 | 2.2676 |
296
+ | 0.101 | 9.58 | 123000 | 2.2732 |
297
+ | 0.1135 | 9.62 | 123500 | 2.2706 |
298
+ | 0.1168 | 9.66 | 124000 | 2.2768 |
299
+ | 0.1164 | 9.7 | 124500 | 2.2803 |
300
+ | 0.113 | 9.74 | 125000 | 2.2813 |
301
+ | 0.0944 | 9.78 | 125500 | 2.2862 |
302
+ | 0.1189 | 9.82 | 126000 | 2.2904 |
303
+ | 0.1059 | 9.86 | 126500 | 2.2905 |
304
+ | 0.1108 | 9.9 | 127000 | 2.2920 |
305
+ | 0.1195 | 9.94 | 127500 | 2.2911 |
306
+ | 0.1009 | 9.97 | 128000 | 2.2905 |
307
 
308
 
309
  ### Framework versions
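
Note on reading the results table: in the added (`+`) rows the training loss keeps falling while the validation loss climbs toward 2.2905, so the last step is not necessarily the checkpoint with the lowest validation loss. The following is a minimal sketch, not part of the commit, of how such rows could be parsed to find the lowest-validation-loss step. The five sample rows are copied from the added side of the diff; the names `ROW`, `parse_rows`, and `best_step` are hypothetical helpers introduced only for illustration.

```python
import re

# Five added ("+") rows copied verbatim from the README diff above.
DIFF_ROWS = """\
+ | 0.2324 | 0.04 | 500 | 1.4461 |
+ | 0.5452 | 0.97 | 12500 | 1.0867 |
+ | 0.3133 | 1.99 | 25500 | 1.4371 |
+ | 0.1919 | 6.0 | 77000 | 1.9458 |
+ | 0.1009 | 9.97 | 128000 | 2.2905 |
"""

# Optional diff marker, then four pipe-separated numeric cells:
# training loss, epoch, step, validation loss.
ROW = re.compile(
    r"^[+-]?\s*\|\s*([\d.]+)\s*\|\s*([\d.]+)\s*\|\s*(\d+)\s*\|\s*([\d.]+)\s*\|\s*$"
)


def parse_rows(text):
    """Return one dict per table row; lines that are not rows are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            train, epoch, step, val = match.groups()
            rows.append({
                "train_loss": float(train),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val),
            })
    return rows


def best_step(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r["val_loss"])


rows = parse_rows(DIFF_ROWS)
best = best_step(rows)
print(best["step"], best["val_loss"])
```

Among the sampled rows, the lowest validation loss is at step 12500 (epoch 0.97), well before the final step 128000.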
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:9b36c353340ab62bc723e54e2cdad6ddd8807ff03a25921de69160fa82671567
3
  size 1112905680
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a05f9de5d189d521a82802c060674369ad8aa361f51de2c94b39a58e5c15e01a
3
  size 1112905680
runs/Nov28_23-41-44_Software-AI/events.out.tfevents.1701202304.Software-AI.10944.3 ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0cf273f9f0d3ba024856d0271d623219f296b22e60da44463244f3397b625264
3
+ size 116312
training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:c38cacb29e9571f92b8a98fd4f574e038e109cc118698443e3e39c8ced5d8c86
3
  size 4219
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e4d30d6acbd5342517077482308f2a5bc4430411e09aa03161dcf0bb68f4f7aa
3
  size 4219
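
Note on the `model.safetensors`, `events.out.tfevents…`, and `training_args.bin` entries: each is a Git LFS pointer file with three fields (`version`, `oid`, `size`), not the binary itself. As a rough sketch, assuming the pointer text is available as a string, the fields could be read and checked against a downloaded file as below. `parse_lfs_pointer` and `file_matches_pointer` are hypothetical helper names, and the sample pointer is the new `training_args.bin` entry from this commit.

```python
import hashlib


def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path, info):
    """Check a local file's size and SHA-256 digest against a parsed pointer."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == info["size"] and hasher.hexdigest() == info["digest"]


# New pointer for training_args.bin, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e4d30d6acbd5342517077482308f2a5bc4430411e09aa03161dcf0bb68f4f7aa
size 4219
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

In this commit the `size` values of `model.safetensors` and `training_args.bin` are unchanged (1112905680 and 4219 bytes), so only the `oid` digest shows that the contents differ.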