End of training
README.md
CHANGED
@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [makhataei/qa-persian-mdeberta-v3-base-squad2](https://huggingface.co/makhataei/qa-persian-mdeberta-v3-base-squad2) on the pquad dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.
 
 ## Model description
 
@@ -36,7 +36,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate:
 - train_batch_size: 5
 - eval_batch_size: 5
 - seed: 42
@@ -48,262 +48,262 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:------:|:---------------:|
[Removed lines 51-306: 256 table rows, truncated in this capture to "| 0."; only line 198 partially survives as "| 0.183 | 5.77 | 74000 |".]
 
 
 ### Framework versions
@@ -17,7 +17,7 @@
 
 This model is a fine-tuned version of [makhataei/qa-persian-mdeberta-v3-base-squad2](https://huggingface.co/makhataei/qa-persian-mdeberta-v3-base-squad2) on the pquad dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.2905
 
 ## Model description
 
@@ -36,7 +36,7 @@
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
+- learning_rate: 1.25e-05
 - train_batch_size: 5
 - eval_batch_size: 5
 - seed: 42
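The batch size listed here, combined with the last logged row of the results table (step 128000 at epoch 9.97), gives a rough estimate of the training set size. The following is a back-of-the-envelope sketch only: it assumes single-device training with no gradient accumulation, neither of which is stated in the model card, and the epoch value is rounded to two decimals in the log.

```python
# Values copied from the model card. Single-device training without
# gradient accumulation is an ASSUMPTION, not something the card states.
train_batch_size = 5
last_step, last_epoch = 128000, 9.97  # final logged row of the results table

# Optimizer steps needed to pass over the training set once.
steps_per_epoch = last_step / last_epoch

# Under the assumption above, each step consumes one batch of examples.
examples_per_epoch = steps_per_epoch * train_batch_size

print(f"about {steps_per_epoch:.0f} steps per epoch")
print(f"about {examples_per_epoch:.0f} training examples per epoch")
```

Because the logged epoch is rounded, the result is an approximation rather than an exact dataset size.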
@@ -48,262 +48,262 @@
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:------:|:---------------:|
+| 0.2324 | 0.04 | 500 | 1.4461 |
+| 0.2076 | 0.08 | 1000 | 1.5599 |
+| 0.2298 | 0.12 | 1500 | 1.6634 |
+| 0.2049 | 0.16 | 2000 | 1.7076 |
+| 0.201 | 0.19 | 2500 | 1.7011 |
+| 0.1981 | 0.23 | 3000 | 1.6738 |
+| 0.1588 | 0.27 | 3500 | 1.7657 |
+| 0.1836 | 0.31 | 4000 | 1.7728 |
+| 0.1958 | 0.35 | 4500 | 1.6861 |
+| 0.162 | 0.39 | 5000 | 1.7768 |
+| 0.1811 | 0.43 | 5500 | 1.7534 |
+| 0.1775 | 0.47 | 6000 | 1.7344 |
+| 0.1806 | 0.51 | 6500 | 1.7266 |
+| 0.1566 | 0.55 | 7000 | 1.8093 |
+| 0.1517 | 0.58 | 7500 | 1.7544 |
+| 0.1146 | 0.62 | 8000 | 1.9351 |
+| 0.154 | 0.66 | 8500 | 1.8271 |
+| 0.323 | 0.7 | 9000 | 1.4894 |
+| 0.2732 | 0.74 | 9500 | 1.4975 |
+| 0.2902 | 0.78 | 10000 | 1.5645 |
+| 0.2561 | 0.82 | 10500 | 1.5566 |
+| 0.2754 | 0.86 | 11000 | 1.4860 |
+| 0.5959 | 0.9 | 11500 | 1.1121 |
+| 0.5385 | 0.94 | 12000 | 1.1161 |
+| 0.5452 | 0.97 | 12500 | 1.0867 |
+| 0.4369 | 1.01 | 13000 | 1.2922 |
+| 0.3144 | 1.05 | 13500 | 1.3008 |
+| 0.3284 | 1.09 | 14000 | 1.4088 |
+| 0.292 | 1.13 | 14500 | 1.4120 |
+| 0.3237 | 1.17 | 15000 | 1.3833 |
+| 0.3077 | 1.21 | 15500 | 1.3974 |
+| 0.3051 | 1.25 | 16000 | 1.5286 |
+| 0.3015 | 1.29 | 16500 | 1.4756 |
+| 0.3496 | 1.32 | 17000 | 1.4013 |
+| 0.3178 | 1.36 | 17500 | 1.3949 |
+| 0.3188 | 1.4 | 18000 | 1.3854 |
+| 0.3176 | 1.44 | 18500 | 1.4037 |
+| 0.3291 | 1.48 | 19000 | 1.3074 |
+| 0.3241 | 1.52 | 19500 | 1.4160 |
+| 0.3164 | 1.56 | 20000 | 1.4171 |
+| 0.3118 | 1.6 | 20500 | 1.4151 |
+| 0.3429 | 1.64 | 21000 | 1.4271 |
+| 0.2833 | 1.68 | 21500 | 1.4760 |
+| 0.3184 | 1.71 | 22000 | 1.3960 |
+| 0.2887 | 1.75 | 22500 | 1.4839 |
+| 0.31 | 1.79 | 23000 | 1.4136 |
+| 0.3282 | 1.83 | 23500 | 1.3990 |
+| 0.3153 | 1.87 | 24000 | 1.4032 |
+| 0.2832 | 1.91 | 24500 | 1.4633 |
+| 0.3439 | 1.95 | 25000 | 1.3783 |
+| 0.3133 | 1.99 | 25500 | 1.4371 |
+| 0.2562 | 2.03 | 26000 | 1.5103 |
+| 0.2338 | 2.06 | 26500 | 1.6106 |
+| 0.2464 | 2.1 | 27000 | 1.6430 |
+| 0.2187 | 2.14 | 27500 | 1.6828 |
+| 0.2353 | 2.18 | 28000 | 1.6362 |
+| 0.2726 | 2.22 | 28500 | 1.5727 |
+| 0.2491 | 2.26 | 29000 | 1.5545 |
+| 0.2743 | 2.3 | 29500 | 1.5949 |
+| 0.2419 | 2.34 | 30000 | 1.6422 |
+| 0.2661 | 2.38 | 30500 | 1.5882 |
+| 0.2105 | 2.42 | 31000 | 1.6584 |
+| 0.2323 | 2.45 | 31500 | 1.6550 |
+| 0.2778 | 2.49 | 32000 | 1.5761 |
+| 0.2411 | 2.53 | 32500 | 1.6776 |
+| 0.2552 | 2.57 | 33000 | 1.6707 |
+| 0.2468 | 2.61 | 33500 | 1.5738 |
+| 0.2398 | 2.65 | 34000 | 1.6479 |
+| 0.2318 | 2.69 | 34500 | 1.6217 |
+| 0.2093 | 2.73 | 35000 | 1.7018 |
+| 0.2344 | 2.77 | 35500 | 1.6763 |
+| 0.2243 | 2.81 | 36000 | 1.6870 |
+| 0.1943 | 2.84 | 36500 | 1.6926 |
+| 0.221 | 2.88 | 37000 | 1.6862 |
+| 0.2256 | 2.92 | 37500 | 1.7141 |
+| 0.3765 | 2.96 | 38000 | 1.5414 |
+| 0.3601 | 3.0 | 38500 | 1.4698 |
+| 0.2237 | 3.04 | 39000 | 1.7001 |
+| 0.2426 | 3.08 | 39500 | 1.6693 |
+| 0.2216 | 3.12 | 40000 | 1.7385 |
+| 0.2417 | 3.16 | 40500 | 1.6941 |
+| 0.2604 | 3.19 | 41000 | 1.6964 |
+| 0.2762 | 3.23 | 41500 | 1.6379 |
+| 0.2399 | 3.27 | 42000 | 1.6806 |
+| 0.2249 | 3.31 | 42500 | 1.7414 |
+| 0.2582 | 3.35 | 43000 | 1.6874 |
+| 0.2524 | 3.39 | 43500 | 1.6648 |
+| 0.2359 | 3.43 | 44000 | 1.7382 |
+| 0.2729 | 3.47 | 44500 | 1.6762 |
+| 0.2729 | 3.51 | 45000 | 1.6736 |
+| 0.2478 | 3.55 | 45500 | 1.7487 |
+| 0.2557 | 3.58 | 46000 | 1.6379 |
+| 0.2486 | 3.62 | 46500 | 1.6746 |
+| 0.2541 | 3.66 | 47000 | 1.6942 |
+| 0.2613 | 3.7 | 47500 | 1.6501 |
+| 0.2552 | 3.74 | 48000 | 1.6790 |
+| 0.2692 | 3.78 | 48500 | 1.6246 |
+| 0.2769 | 3.82 | 49000 | 1.6306 |
+| 0.2542 | 3.86 | 49500 | 1.6412 |
+| 0.2477 | 3.9 | 50000 | 1.6786 |
+| 0.2686 | 3.94 | 50500 | 1.6677 |
+| 0.2324 | 3.97 | 51000 | 1.7063 |
+| 0.2509 | 4.01 | 51500 | 1.6490 |
+| 0.1966 | 4.05 | 52000 | 1.8161 |
+| 0.227 | 4.09 | 52500 | 1.7389 |
+| 0.1881 | 4.13 | 53000 | 1.8164 |
+| 0.2244 | 4.17 | 53500 | 1.7851 |
+| 0.2068 | 4.21 | 54000 | 1.8039 |
+| 0.2094 | 4.25 | 54500 | 1.8641 |
+| 0.1783 | 4.29 | 55000 | 1.8781 |
+| 0.1916 | 4.32 | 55500 | 1.8887 |
+| 0.2221 | 4.36 | 56000 | 1.8061 |
+| 0.2238 | 4.4 | 56500 | 1.7892 |
+| 0.1996 | 4.44 | 57000 | 1.8320 |
+| 0.2074 | 4.48 | 57500 | 1.8944 |
+| 0.2401 | 4.52 | 58000 | 1.7803 |
+| 0.2174 | 4.56 | 58500 | 1.8466 |
+| 0.2258 | 4.6 | 59000 | 1.8607 |
+| 0.223 | 4.64 | 59500 | 1.7695 |
+| 0.185 | 4.68 | 60000 | 1.8845 |
+| 0.2464 | 4.71 | 60500 | 1.8049 |
+| 0.2223 | 4.75 | 61000 | 1.8136 |
+| 0.2192 | 4.79 | 61500 | 1.7870 |
+| 0.2191 | 4.83 | 62000 | 1.7845 |
+| 0.2471 | 4.87 | 62500 | 1.7158 |
+| 0.2085 | 4.91 | 63000 | 1.7816 |
+| 0.2316 | 4.95 | 63500 | 1.7406 |
+| 0.2449 | 4.99 | 64000 | 1.7465 |
+| 0.196 | 5.03 | 64500 | 1.8431 |
+| 0.1851 | 5.07 | 65000 | 1.8751 |
+| 0.1393 | 5.1 | 65500 | 1.9697 |
+| 0.1752 | 5.14 | 66000 | 1.9985 |
+| 0.1438 | 5.18 | 66500 | 2.0071 |
+| 0.2112 | 5.22 | 67000 | 1.9434 |
+| 0.1715 | 5.26 | 67500 | 1.9735 |
+| 0.1982 | 5.3 | 68000 | 1.9319 |
+| 0.1768 | 5.34 | 68500 | 1.9622 |
+| 0.1872 | 5.38 | 69000 | 1.8810 |
+| 0.2059 | 5.42 | 69500 | 1.8445 |
+| 0.1903 | 5.45 | 70000 | 1.8744 |
+| 0.1835 | 5.49 | 70500 | 1.9283 |
+| 0.1843 | 5.53 | 71000 | 1.9938 |
+| 0.1727 | 5.57 | 71500 | 1.9865 |
+| 0.1994 | 5.61 | 72000 | 1.9390 |
+| 0.172 | 5.65 | 72500 | 2.0077 |
+| 0.163 | 5.69 | 73000 | 1.9794 |
+| 0.196 | 5.73 | 73500 | 1.9307 |
+| 0.183 | 5.77 | 74000 | 1.9463 |
+| 0.1764 | 5.81 | 74500 | 1.9981 |
+| 0.1951 | 5.84 | 75000 | 1.9378 |
+| 0.2014 | 5.88 | 75500 | 1.9199 |
+| 0.1766 | 5.92 | 76000 | 1.9824 |
+| 0.1996 | 5.96 | 76500 | 1.9309 |
+| 0.1919 | 6.0 | 77000 | 1.9458 |
+| 0.1664 | 6.04 | 77500 | 2.0603 |
+| 0.1517 | 6.08 | 78000 | 2.0253 |
+| 0.1461 | 6.12 | 78500 | 2.1310 |
+| 0.1379 | 6.16 | 79000 | 2.1506 |
+| 0.1532 | 6.19 | 79500 | 2.0715 |
+| 0.1546 | 6.23 | 80000 | 2.1345 |
+| 0.156 | 6.27 | 80500 | 2.1732 |
+| 0.1648 | 6.31 | 81000 | 2.1075 |
+| 0.1494 | 6.35 | 81500 | 2.1547 |
+| 0.1741 | 6.39 | 82000 | 2.0228 |
+| 0.1391 | 6.43 | 82500 | 2.0426 |
+| 0.1541 | 6.47 | 83000 | 2.0919 |
+| 0.1609 | 6.51 | 83500 | 2.1206 |
+| 0.159 | 6.55 | 84000 | 2.0798 |
+| 0.153 | 6.58 | 84500 | 2.1216 |
+| 0.1822 | 6.62 | 85000 | 2.1276 |
+| 0.1466 | 6.66 | 85500 | 2.1533 |
+| 0.1583 | 6.7 | 86000 | 2.1250 |
+| 0.2012 | 6.74 | 86500 | 2.0619 |
+| 0.1501 | 6.78 | 87000 | 2.0804 |
+| 0.1748 | 6.82 | 87500 | 2.0684 |
+| 0.1571 | 6.86 | 88000 | 2.0902 |
+| 0.169 | 6.9 | 88500 | 2.0587 |
+| 0.183 | 6.94 | 89000 | 2.0435 |
+| 0.1891 | 6.97 | 89500 | 1.9954 |
+| 0.1647 | 7.01 | 90000 | 2.0333 |
+| 0.1511 | 7.05 | 90500 | 2.0657 |
+| 0.1345 | 7.09 | 91000 | 2.1329 |
+| 0.1394 | 7.13 | 91500 | 2.1481 |
+| 0.133 | 7.17 | 92000 | 2.1518 |
+| 0.1508 | 7.21 | 92500 | 2.1051 |
+| 0.1493 | 7.25 | 93000 | 2.1017 |
+| 0.148 | 7.29 | 93500 | 2.0833 |
+| 0.1416 | 7.32 | 94000 | 2.1286 |
+| 0.1185 | 7.36 | 94500 | 2.1419 |
+| 0.1274 | 7.4 | 95000 | 2.1302 |
+| 0.1326 | 7.44 | 95500 | 2.1720 |
+| 0.1378 | 7.48 | 96000 | 2.1826 |
+| 0.1117 | 7.52 | 96500 | 2.2190 |
+| 0.1454 | 7.56 | 97000 | 2.1884 |
+| 0.1288 | 7.6 | 97500 | 2.2184 |
+| 0.1605 | 7.64 | 98000 | 2.1831 |
+| 0.1492 | 7.68 | 98500 | 2.1518 |
+| 0.1573 | 7.71 | 99000 | 2.1452 |
+| 0.1496 | 7.75 | 99500 | 2.1474 |
+| 0.1382 | 7.79 | 100000 | 2.1298 |
+| 0.1368 | 7.83 | 100500 | 2.1231 |
+| 0.1699 | 7.87 | 101000 | 2.0813 |
+| 0.153 | 7.91 | 101500 | 2.1481 |
+| 0.1412 | 7.95 | 102000 | 2.1022 |
+| 0.1663 | 7.99 | 102500 | 2.0768 |
+| 0.1055 | 8.03 | 103000 | 2.1489 |
+| 0.1165 | 8.07 | 103500 | 2.1983 |
+| 0.1165 | 8.1 | 104000 | 2.2075 |
+| 0.1172 | 8.14 | 104500 | 2.1885 |
+| 0.1222 | 8.18 | 105000 | 2.1968 |
+| 0.1407 | 8.22 | 105500 | 2.2263 |
+| 0.1048 | 8.26 | 106000 | 2.2442 |
+| 0.1293 | 8.3 | 106500 | 2.2103 |
+| 0.0964 | 8.34 | 107000 | 2.2572 |
+| 0.1516 | 8.38 | 107500 | 2.2265 |
+| 0.1415 | 8.42 | 108000 | 2.2039 |
+| 0.1135 | 8.45 | 108500 | 2.2160 |
+| 0.1431 | 8.49 | 109000 | 2.2018 |
+| 0.1161 | 8.53 | 109500 | 2.2555 |
+| 0.1705 | 8.57 | 110000 | 2.2277 |
+| 0.1299 | 8.61 | 110500 | 2.2269 |
+| 0.1354 | 8.65 | 111000 | 2.1957 |
+| 0.0906 | 8.69 | 111500 | 2.2220 |
+| 0.1186 | 8.73 | 112000 | 2.2277 |
+| 0.1482 | 8.77 | 112500 | 2.1811 |
+| 0.1628 | 8.81 | 113000 | 2.1620 |
+| 0.1141 | 8.84 | 113500 | 2.1916 |
+| 0.0998 | 8.88 | 114000 | 2.2243 |
+| 0.1227 | 8.92 | 114500 | 2.2303 |
+| 0.1434 | 8.96 | 115000 | 2.2154 |
+| 0.1358 | 9.0 | 115500 | 2.1964 |
+| 0.1263 | 9.04 | 116000 | 2.2122 |
+| 0.0955 | 9.08 | 116500 | 2.2367 |
+| 0.1016 | 9.12 | 117000 | 2.2425 |
+| 0.1106 | 9.16 | 117500 | 2.2399 |
+| 0.1081 | 9.2 | 118000 | 2.2621 |
+| 0.1318 | 9.23 | 118500 | 2.2402 |
+| 0.1174 | 9.27 | 119000 | 2.2364 |
+| 0.1071 | 9.31 | 119500 | 2.2163 |
+| 0.1049 | 9.35 | 120000 | 2.2512 |
+| 0.1289 | 9.39 | 120500 | 2.2354 |
+| 0.1214 | 9.43 | 121000 | 2.2384 |
+| 0.1149 | 9.47 | 121500 | 2.2346 |
+| 0.0977 | 9.51 | 122000 | 2.2553 |
+| 0.1088 | 9.55 | 122500 | 2.2676 |
+| 0.101 | 9.58 | 123000 | 2.2732 |
+| 0.1135 | 9.62 | 123500 | 2.2706 |
+| 0.1168 | 9.66 | 124000 | 2.2768 |
+| 0.1164 | 9.7 | 124500 | 2.2803 |
+| 0.113 | 9.74 | 125000 | 2.2813 |
+| 0.0944 | 9.78 | 125500 | 2.2862 |
+| 0.1189 | 9.82 | 126000 | 2.2904 |
+| 0.1059 | 9.86 | 126500 | 2.2905 |
+| 0.1108 | 9.9 | 127000 | 2.2920 |
+| 0.1195 | 9.94 | 127500 | 2.2911 |
+| 0.1009 | 9.97 | 128000 | 2.2905 |
 
 
 ### Framework versions
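The results table logs validation loss every 500 steps, so the step with the lowest validation loss can be picked out mechanically. The sketch below is illustrative only: the names `Row` and `parse_rows` are hypothetical helpers, not part of any library, and the embedded excerpt holds just four rows copied from the table. Over the full table the minimum validation loss is 1.0867 at step 12500, while the final row reports 2.2905 with training loss near 0.10, a pattern that is consistent with overfitting.

```python
from typing import NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


def parse_rows(markdown: str) -> list[Row]:
    """Parse '| loss | epoch | step | val_loss |' rows, skipping other lines."""
    rows = []
    for line in markdown.splitlines():
        # Drop an optional leading diff marker ('+' or '-') and outer pipes.
        text = line.strip().lstrip("+-").strip().strip("|")
        cells = [cell.strip() for cell in text.split("|")]
        if len(cells) != 4:
            continue
        try:
            rows.append(
                Row(float(cells[0]), float(cells[1]), int(cells[2]), float(cells[3]))
            )
        except ValueError:
            continue  # header or separator row
    return rows


# Excerpt: four rows copied from the results table.
EXCERPT = """
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| 0.2324 | 0.04 | 500 | 1.4461 |
| 0.5452 | 0.97 | 12500 | 1.0867 |
| 0.2449 | 4.99 | 64000 | 1.7465 |
| 0.1009 | 9.97 | 128000 | 2.2905 |
"""

rows = parse_rows(EXCERPT)
best = min(rows, key=lambda r: r.val_loss)
last = rows[-1]
print(f"best: step {best.step}, validation loss {best.val_loss}")
print(f"last: step {last.step}, validation loss {last.val_loss}")
```

Pasting the full table into `EXCERPT` should work the same way, since header, separator, and blank lines are skipped.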
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:a05f9de5d189d521a82802c060674369ad8aa361f51de2c94b39a58e5c15e01a
 size 1112905680
runs/Nov28_23-41-44_Software-AI/events.out.tfevents.1701202304.Software-AI.10944.3
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0cf273f9f0d3ba024856d0271d623219f296b22e60da44463244f3397b625264
+size 116312
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:e4d30d6acbd5342517077482308f2a5bc4430411e09aa03161dcf0bb68f4f7aa
 size 4219
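The three binary files in this commit are stored as Git LFS pointers: small text files carrying a `version`, an `oid`, and a `size` line. A minimal sketch of reading such a pointer and checking downloaded bytes against it follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the check assumes the `oid` uses SHA-256, as it does in every pointer shown here.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check raw file bytes against the pointer's size and SHA-256 digest."""
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# New pointer for training_args.bin, copied from the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e4d30d6acbd5342517077482308f2a5bc4430411e09aa03161dcf0bb68f4f7aa
size 4219
"""

info = parse_lfs_pointer(POINTER)
print(info["algo"], info["size"])
```

To verify a real download, read the file in binary mode and pass its bytes to `matches_pointer(data, info)`.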