End of training
README.md
CHANGED
@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [makhataei/qa-persian-mdeberta-v3-base-squad2](https://huggingface.co/makhataei/qa-persian-mdeberta-v3-base-squad2) on the pquad dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.
+- Loss: 2.4713
 
 ## Model description
 
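For readers who want to try such a checkpoint, here is a minimal inference sketch. It assumes the standard `transformers` question-answering pipeline; the model ID shown is the base checkpoint named in this card (substitute the fine-tuned repository's own ID), and the Persian question/context pair is an invented example, not taken from pquad.

```python
# Hedged sketch: querying a Persian extractive-QA checkpoint with the
# transformers question-answering pipeline. Model ID is the base checkpoint
# named in this card; swap in the fine-tuned repository ID as needed.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="makhataei/qa-persian-mdeberta-v3-base-squad2",
)

context = "پایتخت ایران تهران است."  # "The capital of Iran is Tehran."
question = "پایتخت ایران کجاست؟"     # "Where is the capital of Iran?"

result = qa(question=question, context=context)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': '...'}
```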
@@ -36,7 +36,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate:
+- learning_rate: 5e-05
 - train_batch_size: 5
 - eval_batch_size: 5
 - seed: 42
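These values map directly onto `transformers.TrainingArguments`. The sketch below is an illustrative reconstruction using only the hyperparameters listed in this hunk plus the 500-step evaluation cadence visible in the results table; the output directory, evaluation settings, and anything not listed here are assumptions, not the authors' actual training script.

```python
# Hedged reconstruction of the logged hyperparameters; not the authors' script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qa-persian-mdeberta-v3-base-pquad",  # hypothetical output dir
    learning_rate=5e-05,              # from the card
    per_device_train_batch_size=5,    # train_batch_size: 5
    per_device_eval_batch_size=5,     # eval_batch_size: 5
    seed=42,                          # seed: 42
    evaluation_strategy="steps",      # assumption, matching the eval-every-500-steps table
    eval_steps=500,
    logging_steps=500,
)
```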
@@ -48,262 +48,262 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:------:|:---------------:|
+| 0.6303 | 0.04 | 500 | 1.0524 |
+| 0.6072 | 0.08 | 1000 | 1.1925 |
+| 0.6577 | 0.12 | 1500 | 0.9827 |
+| 0.6057 | 0.16 | 2000 | 1.0477 |
+| 0.6033 | 0.19 | 2500 | 1.0451 |
+| 0.6192 | 0.23 | 3000 | 1.0471 |
+| 0.6037 | 0.27 | 3500 | 1.0151 |
+| 0.6043 | 0.31 | 4000 | 1.0837 |
+| 0.5914 | 0.35 | 4500 | 1.1688 |
+| 0.5731 | 0.39 | 5000 | 1.1192 |
+| 0.6051 | 0.43 | 5500 | 1.0116 |
+| 0.5836 | 0.47 | 6000 | 1.0353 |
+| 0.6039 | 0.51 | 6500 | 1.0284 |
+| 0.6132 | 0.55 | 7000 | 1.0560 |
+| 0.5856 | 0.58 | 7500 | 1.1219 |
+| 0.5806 | 0.62 | 8000 | 1.0780 |
+| 0.6125 | 0.66 | 8500 | 0.9812 |
+| 0.6356 | 0.7 | 9000 | 1.0886 |
+| 0.5905 | 0.74 | 9500 | 1.0187 |
+| 0.6037 | 0.78 | 10000 | 1.1142 |
+| 0.5453 | 0.82 | 10500 | 1.1535 |
+| 0.6065 | 0.86 | 11000 | 1.0309 |
+| 0.5638 | 0.9 | 11500 | 1.1029 |
+| 0.5774 | 0.94 | 12000 | 1.1913 |
+| 0.6295 | 0.97 | 12500 | 1.1206 |
+| 0.5427 | 1.01 | 13000 | 1.2944 |
+| 0.468 | 1.05 | 13500 | 1.2117 |
+| 0.4965 | 1.09 | 14000 | 1.1481 |
+| 0.4549 | 1.13 | 14500 | 1.3192 |
+| 0.4811 | 1.17 | 15000 | 1.1514 |
+| 0.4507 | 1.21 | 15500 | 1.2152 |
+| 0.4845 | 1.25 | 16000 | 1.2142 |
+| 0.4529 | 1.29 | 16500 | 1.2136 |
+| 0.5138 | 1.32 | 17000 | 1.2248 |
+| 0.4843 | 1.36 | 17500 | 1.1198 |
+| 0.501 | 1.4 | 18000 | 1.1145 |
+| 0.4898 | 1.44 | 18500 | 1.0842 |
+| 0.505 | 1.48 | 19000 | 1.1711 |
+| 0.4979 | 1.52 | 19500 | 1.1617 |
+| 0.4836 | 1.56 | 20000 | 1.1651 |
+| 0.498 | 1.6 | 20500 | 1.1534 |
+| 0.51 | 1.64 | 21000 | 1.1642 |
+| 0.4648 | 1.68 | 21500 | 1.2294 |
+| 0.4955 | 1.71 | 22000 | 1.1252 |
+| 0.4645 | 1.75 | 22500 | 1.3847 |
+| 0.503 | 1.79 | 23000 | 1.1256 |
+| 0.5167 | 1.83 | 23500 | 1.2102 |
+| 0.5039 | 1.87 | 24000 | 1.2769 |
+| 0.4669 | 1.91 | 24500 | 1.2660 |
+| 0.5273 | 1.95 | 25000 | 1.1858 |
+| 0.5167 | 1.99 | 25500 | 1.0963 |
+| 0.4213 | 2.03 | 26000 | 1.3819 |
+| 0.4117 | 2.06 | 26500 | 1.3915 |
+| 0.4107 | 2.1 | 27000 | 1.2944 |
+| 0.3568 | 2.14 | 27500 | 1.4241 |
+| 0.3969 | 2.18 | 28000 | 1.4366 |
+| 0.4176 | 2.22 | 28500 | 1.4275 |
+| 0.4129 | 2.26 | 29000 | 1.3802 |
+| 0.4394 | 2.3 | 29500 | 1.3639 |
+| 0.3945 | 2.34 | 30000 | 1.4072 |
+| 0.3682 | 2.38 | 30500 | 1.3478 |
+| 0.3564 | 2.42 | 31000 | 1.4649 |
+| 0.3839 | 2.45 | 31500 | 1.6282 |
+| 0.4148 | 2.49 | 32000 | 1.4738 |
+| 0.3998 | 2.53 | 32500 | 1.4206 |
+| 0.3987 | 2.57 | 33000 | 1.3601 |
+| 0.3881 | 2.61 | 33500 | 1.4010 |
+| 0.3852 | 2.65 | 34000 | 1.3936 |
+| 0.3703 | 2.69 | 34500 | 1.4996 |
+| 0.3955 | 2.73 | 35000 | 1.4243 |
+| 0.414 | 2.77 | 35500 | 1.3599 |
+| 0.3723 | 2.81 | 36000 | 1.4481 |
+| 0.3927 | 2.84 | 36500 | 1.4327 |
+| 0.3993 | 2.88 | 37000 | 1.3312 |
+| 0.4074 | 2.92 | 37500 | 1.3248 |
+| 0.4978 | 2.96 | 38000 | 1.2219 |
+| 0.4957 | 3.0 | 38500 | 1.1998 |
+| 0.3192 | 3.04 | 39000 | 1.5531 |
+| 0.385 | 3.08 | 39500 | 1.3462 |
+| 0.351 | 3.12 | 40000 | 1.3456 |
+| 0.3584 | 3.16 | 40500 | 1.4219 |
+| 0.3696 | 3.19 | 41000 | 1.5244 |
+| 0.3872 | 3.23 | 41500 | 1.5260 |
+| 0.3916 | 3.27 | 42000 | 1.3642 |
+| 0.3598 | 3.31 | 42500 | 1.5210 |
+| 0.3749 | 3.35 | 43000 | 1.3730 |
+| 0.3781 | 3.39 | 43500 | 1.3904 |
+| 0.38 | 3.43 | 44000 | 1.3847 |
+| 0.4019 | 3.47 | 44500 | 1.3194 |
+| 0.3876 | 3.51 | 45000 | 1.4494 |
+| 0.3916 | 3.55 | 45500 | 1.5578 |
+| 0.3895 | 3.58 | 46000 | 1.4429 |
+| 0.3647 | 3.62 | 46500 | 1.3499 |
+| 0.3848 | 3.66 | 47000 | 1.4542 |
+| 0.3748 | 3.7 | 47500 | 1.2933 |
+| 0.3892 | 3.74 | 48000 | 1.3987 |
+| 0.3807 | 3.78 | 48500 | 1.4392 |
+| 0.4057 | 3.82 | 49000 | 1.3771 |
+| 0.3922 | 3.86 | 49500 | 1.3830 |
+| 0.3976 | 3.9 | 50000 | 1.2871 |
+| 0.383 | 3.94 | 50500 | 1.4306 |
+| 0.3771 | 3.97 | 51000 | 1.3849 |
+| 0.3793 | 4.01 | 51500 | 1.5489 |
+| 0.2792 | 4.05 | 52000 | 1.5708 |
+| 0.2859 | 4.09 | 52500 | 1.5634 |
+| 0.2839 | 4.13 | 53000 | 1.6146 |
+| 0.3118 | 4.17 | 53500 | 1.5593 |
+| 0.3248 | 4.21 | 54000 | 1.5015 |
+| 0.2981 | 4.25 | 54500 | 1.5262 |
+| 0.2697 | 4.29 | 55000 | 1.6662 |
+| 0.2929 | 4.32 | 55500 | 1.6073 |
+| 0.3233 | 4.36 | 56000 | 1.4935 |
+| 0.2944 | 4.4 | 56500 | 1.5488 |
+| 0.3021 | 4.44 | 57000 | 1.5612 |
+| 0.3162 | 4.48 | 57500 | 1.6165 |
+| 0.337 | 4.52 | 58000 | 1.4389 |
+| 0.3071 | 4.56 | 58500 | 1.6181 |
+| 0.346 | 4.6 | 59000 | 1.5063 |
+| 0.3359 | 4.64 | 59500 | 1.5319 |
+| 0.283 | 4.68 | 60000 | 1.5716 |
+| 0.3184 | 4.71 | 60500 | 1.5787 |
+| 0.2911 | 4.75 | 61000 | 1.6882 |
+| 0.3325 | 4.79 | 61500 | 1.5195 |
+| 0.3223 | 4.83 | 62000 | 1.6573 |
+| 0.3225 | 4.87 | 62500 | 1.4265 |
+| 0.3028 | 4.91 | 63000 | 1.5742 |
+| 0.318 | 4.95 | 63500 | 1.5170 |
+| 0.3047 | 4.99 | 64000 | 1.5051 |
+| 0.2552 | 5.03 | 64500 | 1.7450 |
+| 0.2326 | 5.07 | 65000 | 1.6757 |
+| 0.2174 | 5.1 | 65500 | 1.9674 |
+| 0.2423 | 5.14 | 66000 | 1.8576 |
+| 0.2066 | 5.18 | 66500 | 1.7914 |
+| 0.2717 | 5.22 | 67000 | 1.8060 |
+| 0.2353 | 5.26 | 67500 | 1.7933 |
+| 0.2499 | 5.3 | 68000 | 1.7655 |
+| 0.2415 | 5.34 | 68500 | 1.9094 |
+| 0.2541 | 5.38 | 69000 | 1.7136 |
+| 0.2616 | 5.42 | 69500 | 1.7428 |
+| 0.2402 | 5.45 | 70000 | 1.8088 |
+| 0.2543 | 5.49 | 70500 | 1.6588 |
+| 0.2756 | 5.53 | 71000 | 1.6765 |
+| 0.2447 | 5.57 | 71500 | 1.8263 |
+| 0.2643 | 5.61 | 72000 | 1.6329 |
+| 0.224 | 5.65 | 72500 | 1.7456 |
+| 0.2385 | 5.69 | 73000 | 1.7220 |
+| 0.2488 | 5.73 | 73500 | 1.5867 |
+| 0.2424 | 5.77 | 74000 | 1.7738 |
+| 0.2618 | 5.81 | 74500 | 1.7264 |
+| 0.2498 | 5.84 | 75000 | 1.6741 |
+| 0.264 | 5.88 | 75500 | 1.6714 |
+| 0.2364 | 5.92 | 76000 | 1.6306 |
+| 0.2377 | 5.96 | 76500 | 1.8281 |
+| 0.2475 | 6.0 | 77000 | 1.6020 |
+| 0.1762 | 6.04 | 77500 | 1.9256 |
+| 0.182 | 6.08 | 78000 | 1.9239 |
+| 0.1714 | 6.12 | 78500 | 1.9346 |
+| 0.1702 | 6.16 | 79000 | 2.0071 |
+| 0.1885 | 6.19 | 79500 | 1.8634 |
+| 0.1933 | 6.23 | 80000 | 2.0296 |
+| 0.1973 | 6.27 | 80500 | 1.8691 |
+| 0.1698 | 6.31 | 81000 | 1.9280 |
+| 0.1935 | 6.35 | 81500 | 1.9555 |
+| 0.1892 | 6.39 | 82000 | 1.9595 |
+| 0.1879 | 6.43 | 82500 | 1.9741 |
+| 0.1939 | 6.47 | 83000 | 2.0260 |
+| 0.1928 | 6.51 | 83500 | 2.0924 |
+| 0.1906 | 6.55 | 84000 | 1.9643 |
+| 0.1729 | 6.58 | 84500 | 2.1318 |
+| 0.2198 | 6.62 | 85000 | 1.8794 |
+| 0.1941 | 6.66 | 85500 | 1.9834 |
+| 0.1798 | 6.7 | 86000 | 2.0396 |
+| 0.2141 | 6.74 | 86500 | 1.8159 |
+| 0.1748 | 6.78 | 87000 | 2.0235 |
+| 0.2038 | 6.82 | 87500 | 1.9760 |
+| 0.1948 | 6.86 | 88000 | 1.9607 |
+| 0.209 | 6.9 | 88500 | 1.9526 |
+| 0.1951 | 6.94 | 89000 | 2.0364 |
+| 0.2238 | 6.97 | 89500 | 1.8029 |
+| 0.1913 | 7.01 | 90000 | 2.0869 |
+| 0.153 | 7.05 | 90500 | 2.1914 |
+| 0.1393 | 7.09 | 91000 | 2.2019 |
+| 0.145 | 7.13 | 91500 | 2.1408 |
+| 0.1483 | 7.17 | 92000 | 2.1024 |
+| 0.1396 | 7.21 | 92500 | 2.1224 |
+| 0.1313 | 7.25 | 93000 | 2.1517 |
+| 0.1288 | 7.29 | 93500 | 2.2002 |
+| 0.1569 | 7.32 | 94000 | 2.1955 |
+| 0.1291 | 7.36 | 94500 | 2.3081 |
+| 0.1702 | 7.4 | 95000 | 2.0735 |
+| 0.127 | 7.44 | 95500 | 2.0001 |
+| 0.1503 | 7.48 | 96000 | 2.1695 |
+| 0.1356 | 7.52 | 96500 | 2.1271 |
+| 0.1466 | 7.56 | 97000 | 2.0921 |
+| 0.1408 | 7.6 | 97500 | 2.1379 |
+| 0.1367 | 7.64 | 98000 | 2.0763 |
+| 0.1487 | 7.68 | 98500 | 2.2021 |
+| 0.1657 | 7.71 | 99000 | 2.0800 |
+| 0.1408 | 7.75 | 99500 | 2.1433 |
+| 0.1328 | 7.79 | 100000 | 2.0924 |
+| 0.1485 | 7.83 | 100500 | 2.1479 |
+| 0.1546 | 7.87 | 101000 | 2.0750 |
+| 0.1501 | 7.91 | 101500 | 2.0885 |
+| 0.1391 | 7.95 | 102000 | 2.1003 |
+| 0.173 | 7.99 | 102500 | 1.9603 |
+| 0.096 | 8.03 | 103000 | 2.2128 |
+| 0.0967 | 8.07 | 103500 | 2.2105 |
+| 0.0909 | 8.1 | 104000 | 2.2345 |
+| 0.086 | 8.14 | 104500 | 2.3129 |
+| 0.1052 | 8.18 | 105000 | 2.3452 |
+| 0.0975 | 8.22 | 105500 | 2.3279 |
+| 0.0875 | 8.26 | 106000 | 2.3719 |
+| 0.1167 | 8.3 | 106500 | 2.2740 |
+| 0.0724 | 8.34 | 107000 | 2.3902 |
+| 0.1067 | 8.38 | 107500 | 2.3961 |
+| 0.1017 | 8.42 | 108000 | 2.2360 |
+| 0.1003 | 8.45 | 108500 | 2.2271 |
+| 0.1113 | 8.49 | 109000 | 2.3305 |
+| 0.113 | 8.53 | 109500 | 2.2344 |
+| 0.1047 | 8.57 | 110000 | 2.2780 |
+| 0.0935 | 8.61 | 110500 | 2.3290 |
+| 0.1159 | 8.65 | 111000 | 2.3176 |
+| 0.0936 | 8.69 | 111500 | 2.3421 |
+| 0.0954 | 8.73 | 112000 | 2.2757 |
+| 0.1131 | 8.77 | 112500 | 2.2388 |
+| 0.0939 | 8.81 | 113000 | 2.3273 |
+| 0.1026 | 8.84 | 113500 | 2.2831 |
+| 0.0842 | 8.88 | 114000 | 2.3705 |
+| 0.1031 | 8.92 | 114500 | 2.3365 |
+| 0.1114 | 8.96 | 115000 | 2.2940 |
+| 0.1145 | 9.0 | 115500 | 2.2661 |
+| 0.0796 | 9.04 | 116000 | 2.3672 |
+| 0.0591 | 9.08 | 116500 | 2.5256 |
+| 0.0724 | 9.12 | 117000 | 2.4654 |
+| 0.0733 | 9.16 | 117500 | 2.4303 |
+| 0.0727 | 9.2 | 118000 | 2.5239 |
+| 0.0649 | 9.23 | 118500 | 2.4831 |
+| 0.0874 | 9.27 | 119000 | 2.4823 |
+| 0.0679 | 9.31 | 119500 | 2.5225 |
+| 0.0798 | 9.35 | 120000 | 2.4684 |
+| 0.0774 | 9.39 | 120500 | 2.4247 |
+| 0.0837 | 9.43 | 121000 | 2.3901 |
+| 0.077 | 9.47 | 121500 | 2.4002 |
+| 0.0591 | 9.51 | 122000 | 2.4534 |
+| 0.0598 | 9.55 | 122500 | 2.4878 |
+| 0.0662 | 9.58 | 123000 | 2.5026 |
+| 0.0716 | 9.62 | 123500 | 2.4876 |
+| 0.0744 | 9.66 | 124000 | 2.4856 |
+| 0.0759 | 9.7 | 124500 | 2.4703 |
+| 0.0713 | 9.74 | 125000 | 2.4614 |
+| 0.0687 | 9.78 | 125500 | 2.4629 |
+| 0.0706 | 9.82 | 126000 | 2.4621 |
+| 0.0702 | 9.86 | 126500 | 2.4521 |
+| 0.0609 | 9.9 | 127000 | 2.4698 |
+| 0.0782 | 9.94 | 127500 | 2.4702 |
+| 0.062 | 9.97 | 128000 | 2.4713 |
 
 
 ### Framework versions
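The updated table logs an evaluation every 500 steps. A hedged sketch for pulling the logged (step, validation loss) pairs out of a locally saved copy of the new README and reporting the lowest value; the filename `README.md` and the table layout shown above are the only assumptions.

```python
# Hedged sketch: parse the markdown results table from a local copy of this
# README and report the step with the lowest validation loss.
import re

rows = []
with open("README.md", encoding="utf-8") as f:
    for line in f:
        # Match rows like: | 0.6303 | 0.04 | 500 | 1.0524 |
        m = re.match(r"\|\s*([\d.]+)\s*\|\s*([\d.]+)\s*\|\s*(\d+)\s*\|\s*([\d.]+)\s*\|", line)
        if m:
            train_loss, epoch, step, val_loss = m.groups()
            rows.append((int(step), float(epoch), float(train_loss), float(val_loss)))

best = min(rows, key=lambda r: r[3])
print(f"lowest validation loss {best[3]} at step {best[0]} (epoch {best[1]})")
```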
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:200eb62b2ddf181030616f3904b90b7f2a09a8e65b4e7bce28b747712fc0e5a1
 size 1112905680
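The `model.safetensors` entry is a Git LFS pointer: the `oid sha256:` line is the SHA-256 digest of the actual weight file and `size` is its byte count. A hedged sketch for checking a downloaded copy against the new pointer values, using only the digest and size shown above:

```python
# Hedged sketch: verify a downloaded model.safetensors against the LFS pointer.
import hashlib

EXPECTED_OID = "200eb62b2ddf181030616f3904b90b7f2a09a8e65b4e7bce28b747712fc0e5a1"
EXPECTED_SIZE = 1112905680  # bytes, from the pointer's "size" line

h = hashlib.sha256()
n_bytes = 0
with open("model.safetensors", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
        h.update(chunk)
        n_bytes += len(chunk)

assert n_bytes == EXPECTED_SIZE, f"size mismatch: {n_bytes}"
assert h.hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("model.safetensors matches the LFS pointer")
```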
runs/Nov27_22-31-52_Software-AI/events.out.tfevents.1701111713.Software-AI.10944.1
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f7c996d3a97b3eb23cddaa8a6b655b2f24c04c3cbd4fba0b0fc86494cfcdbfc7
+size 116309
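The added `events.out.tfevents...` file is a TensorBoard event log written during this run. A hedged sketch of reading its scalars with TensorBoard's `EventAccumulator`; the tag name `eval/loss` is the one the `transformers` Trainer usually emits, but it is an assumption here.

```python
# Hedged sketch: read logged scalars from the added TensorBoard event file.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

run_dir = "runs/Nov27_22-31-52_Software-AI"  # directory containing the event file
acc = EventAccumulator(run_dir)
acc.Reload()  # parse the event file(s) in the directory

print(acc.Tags()["scalars"])            # list the available scalar tags
for event in acc.Scalars("eval/loss"):  # tag name is an assumption
    print(event.step, event.value)
```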
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:ff320f9c9d6b606d08c02765fa1f3d1f5688dcbe354925aa77f75fb3f9795ed5
 size 4219
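`training_args.bin` is the pickled `TrainingArguments` object that the Trainer saves alongside the checkpoint. A hedged sketch of inspecting it: loading needs a compatible `transformers` install for unpickling, and recent PyTorch releases require `weights_only=False` because the file is a full pickle rather than a plain tensor archive (drop that argument on versions that predate it).

```python
# Hedged sketch: inspect the hyperparameters stored in training_args.bin.
import torch

args = torch.load("training_args.bin", weights_only=False)
print(type(args).__name__)       # expected: TrainingArguments (or a subclass)
print(args.learning_rate)        # should match the card, e.g. 5e-05
print(args.per_device_train_batch_size, args.seed)
```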