---
tags:
- generated_from_trainer
model-index:
- name: AraT5v2-base-1024-p-l-akk-en-20240811-231511
  results: []
---

# AraT5v2-base-1024-p-l-akk-en-20240811-231511

This model was trained from scratch on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4597

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10

### Training results

| Training Loss | Epoch  | Step   | Validation Loss |
|:-------------:|:------:|:------:|:---------------:|
| 0.364         | 0.0552 | 2500   | 0.4332          |
| 0.3744        | 0.1105 | 5000   | 0.4408          |
| 0.3825        | 0.1657 | 7500   | 0.4430          |
| 0.384         | 0.2210 | 10000  | 0.4377          |
| 0.3751        | 0.2762 | 12500  | 0.4421          |
| 0.4055        | 0.3314 | 15000  | 0.4372          |
| 0.355         | 0.3867 | 17500  | 0.4352          |
| 0.3871        | 0.4419 | 20000  | 0.4376          |
| 0.4159        | 0.4972 | 22500  | 0.4335          |
| 0.3782        | 0.5524 | 25000  | 0.4295          |
| 0.384         | 0.6077 | 27500  | 0.4305          |
| 0.3782        | 0.6629 | 30000  | 0.4311          |
| 0.3708        | 0.7181 | 32500  | 0.4332          |
| 0.3809        | 0.7734 | 35000  | 0.4263          |
| 0.3964        | 0.8286 | 37500  | 0.4280          |
| 0.3832        | 0.8839 | 40000  | 0.4253          |
| 0.4052        | 0.9391 | 42500  | 0.4320          |
| 0.4015        | 0.9943 | 45000  | 0.4261          |
| 0.352         | 1.0496 | 47500  | 0.4307          |
| 0.3456        | 1.1048 | 50000  | 0.4318          |
| 0.3726        | 1.1601 | 52500  | 0.4366          |
| 0.323         | 1.2153 | 55000  | 0.4357          |
| 0.3565        | 1.2705 | 57500  | 0.4285          |
| 0.3679        | 1.3258 | 60000  | 0.4329          |
| 0.3921        | 1.3810 | 62500  | 0.4257          |
| 0.3587        | 1.4363 | 65000  | 0.4248          |
| 0.3502        | 1.4915 | 67500  | 0.4283          |
| 0.3768        | 1.5468 | 70000  | 0.4283          |
| 0.3461        | 1.6020 | 72500  | 0.4226          |
| 0.3524        | 1.6572 | 75000  | 0.4238          |
| 0.3838        | 1.7125 | 77500  | 0.4220          |
| 0.3849        | 1.7677 | 80000  | 0.4213          |
| 0.3731        | 1.8230 | 82500  | 0.4184          |
| 0.3722        | 1.8782 | 85000  | 0.4212          |
| 0.3762        | 1.9334 | 87500  | 0.4179          |
| 0.3737        | 1.9887 | 90000  | 0.4229          |
| 0.3311        | 2.0439 | 92500  | 0.4277          |
| 0.3308        | 2.0992 | 95000  | 0.4245          |
| 0.3454        | 2.1544 | 97500  | 0.4258          |
| 0.2972        | 2.2097 | 100000 | 0.4362          |
| 0.3284        | 2.2649 | 102500 | 0.4290          |
| 0.3774        | 2.3201 | 105000 | 0.4302          |
| 0.3287        | 2.3754 | 107500 | 0.4250          |
| 0.3281        | 2.4306 | 110000 | 0.4219          |
| 0.3312        | 2.4859 | 112500 | 0.4249          |
| 0.3651        | 2.5411 | 115000 | 0.4222          |
| 0.3639        | 2.5963 | 117500 | 0.4243          |
| 0.3602        | 2.6516 | 120000 | 0.4187          |
| 0.3222        | 2.7068 | 122500 | 0.4256          |
| 0.3474        | 2.7621 | 125000 | 0.4204          |
| 0.3317        | 2.8173 | 127500 | 0.4246          |
| 0.3616        | 2.8725 | 130000 | 0.4148          |
| 0.3729        | 2.9278 | 132500 | 0.4191          |
| 0.352         | 2.9830 | 135000 | 0.4184          |
| 0.2849        | 3.0383 | 137500 | 0.4272          |
| 0.3148        | 3.0935 | 140000 | 0.4285          |
| 0.3032        | 3.1488 | 142500 | 0.4324          |
| 0.3306        | 3.2040 | 145000 | 0.4238          |
| 0.3377        | 3.2592 | 147500 | 0.4264          |
| 0.3373        | 3.3145 | 150000 | 0.4254          |
| 0.3079        | 3.3697 | 152500 | 0.4267          |
| 0.3165        | 3.4250 | 155000 | 0.4239          |
| 0.3469        | 3.4802 | 157500 | 0.4225          |
| 0.3102        | 3.5354 | 160000 | 0.4194          |
| 0.3231        | 3.5907 | 162500 | 0.4199          |
| 0.3383        | 3.6459 | 165000 | 0.4210          |
| 0.3048        | 3.7012 | 167500 | 0.4188          |
| 0.3222        | 3.7564 | 170000 | 0.4206          |
| 0.3505        | 3.8116 | 172500 | 0.4202          |
| 0.3209        | 3.8669 | 175000 | 0.4172          |
| 0.3146        | 3.9221 | 177500 | 0.4197          |
| 0.3237        | 3.9774 | 180000 | 0.4204          |
| 0.3087        | 4.0326 | 182500 | 0.4298          |
| 0.2979        | 4.0879 | 185000 | 0.4278          |
| 0.3046        | 4.1431 | 187500 | 0.4203          |
| 0.3145        | 4.1983 | 190000 | 0.4273          |
| 0.3511        | 4.2536 | 192500 | 0.4282          |
| 0.3845        | 4.3088 | 195000 | 0.4255          |
| 0.2889        | 4.3641 | 197500 | 0.4261          |
| 0.2764        | 4.4193 | 200000 | 0.4269          |
| 0.3089        | 4.4745 | 202500 | 0.4280          |
| 0.2928        | 4.5298 | 205000 | 0.4216          |
| 0.2982        | 4.5850 | 207500 | 0.4294          |
| 0.3008        | 4.6403 | 210000 | 0.4240          |
| 0.2997        | 4.6955 | 212500 | 0.4239          |
| 0.2964        | 4.7508 | 215000 | 0.4215          |
| 0.2822        | 4.8060 | 217500 | 0.4214          |
| 0.3216        | 4.8612 | 220000 | 0.4219          |
| 0.2873        | 4.9165 | 222500 | 0.4197          |
| 0.314         | 4.9717 | 225000 | 0.4214          |
| 0.3212        | 5.0270 | 227500 | 0.4292          |
| 0.2883        | 5.0822 | 230000 | 0.4333          |
| 0.2828        | 5.1374 | 232500 | 0.4341          |
| 0.2498        | 5.1927 | 235000 | 0.4357          |
| 0.2823        | 5.2479 | 237500 | 0.4289          |
| 0.2775        | 5.3032 | 240000 | 0.4352          |
| 0.3022        | 5.3584 | 242500 | 0.4329          |
| 0.269         | 5.4136 | 245000 | 0.4336          |
| 0.2769        | 5.4689 | 247500 | 0.4291          |
| 0.2627        | 5.5241 | 250000 | 0.4328          |
| 0.2632        | 5.5794 | 252500 | 0.4298          |
| 0.2856        | 5.6346 | 255000 | 0.4338          |
| 0.3124        | 5.6899 | 257500 | 0.4288          |
| 0.2662        | 5.7451 | 260000 | 0.4280          |
| 0.2849        | 5.8003 | 262500 | 0.4303          |
| 0.2972        | 5.8556 | 265000 | 0.4253          |
| 0.2866        | 5.9108 | 267500 | 0.4252          |
| 0.2689        | 5.9661 | 270000 | 0.4204          |
| 0.2459        | 6.0213 | 272500 | 0.4355          |
| 0.281         | 6.0765 | 275000 | 0.4386          |
| 0.29          | 6.1318 | 277500 | 0.4396          |
| 0.2587        | 6.1870 | 280000 | 0.4383          |
| 0.2892        | 6.2423 | 282500 | 0.4393          |
| 0.2761        | 6.2975 | 285000 | 0.4393          |
| 0.2796        | 6.3527 | 287500 | 0.4378          |
| 0.2586        | 6.4080 | 290000 | 0.4330          |
| 0.2397        | 6.4632 | 292500 | 0.4412          |
| 0.2823        | 6.5185 | 295000 | 0.4306          |
| 0.2903        | 6.5737 | 297500 | 0.4351          |
| 0.2675        | 6.6290 | 300000 | 0.4369          |
| 0.2949        | 6.6842 | 302500 | 0.4438          |
| 0.284         | 6.7394 | 305000 | 0.4361          |
| 0.2794        | 6.7947 | 307500 | 0.4304          |
| 0.2475        | 6.8499 | 310000 | 0.4399          |
| 0.2804        | 6.9052 | 312500 | 0.4317          |
| 0.2634        | 6.9604 | 315000 | 0.4359          |
| 0.2447        | 7.0156 | 317500 | 0.4418          |
| 0.2582        | 7.0709 | 320000 | 0.4471          |
| 0.2468        | 7.1261 | 322500 | 0.4492          |
| 0.2584        | 7.1814 | 325000 | 0.4436          |
| 0.2619        | 7.2366 | 327500 | 0.4444          |
| 0.2273        | 7.2919 | 330000 | 0.4458          |
| 0.2385        | 7.3471 | 332500 | 0.4434          |
| 0.2324        | 7.4023 | 335000 | 0.4470          |
| 0.2475        | 7.4576 | 337500 | 0.4475          |
| 0.2591        | 7.5128 | 340000 | 0.4456          |
| 0.2565        | 7.5681 | 342500 | 0.4451          |
| 0.2258        | 7.6233 | 345000 | 0.4424          |
| 0.2253        | 7.6785 | 347500 | 0.4444          |
| 0.2418        | 7.7338 | 350000 | 0.4470          |
| 0.2608        | 7.7890 | 352500 | 0.4465          |
| 0.2497        | 7.8443 | 355000 | 0.4472          |
| 0.2516        | 7.8995 | 357500 | 0.4446          |
| 0.2423        | 7.9547 | 360000 | 0.4426          |
| 0.2711        | 8.0100 | 362500 | 0.4470          |
| 0.2386        | 8.0652 | 365000 | 0.4530          |
| 0.2317        | 8.1205 | 367500 | 0.4550          |
| 0.243         | 8.1757 | 370000 | 0.4560          |
| 0.2273        | 8.2310 | 372500 | 0.4523          |
| 0.2463        | 8.2862 | 375000 | 0.4534          |
| 0.2435        | 8.3414 | 377500 | 0.4520          |
| 0.2805        | 8.3967 | 380000 | 0.4541          |
| 0.2437        | 8.4519 | 382500 | 0.4548          |
| 0.2583        | 8.5072 | 385000 | 0.4531          |
| 0.2241        | 8.5624 | 387500 | 0.4502          |
| 0.2531        | 8.6176 | 390000 | 0.4551          |
| 0.2393        | 8.6729 | 392500 | 0.4524          |
| 0.2506        | 8.7281 | 395000 | 0.4525          |
| 0.2222        | 8.7834 | 397500 | 0.4533          |
| 0.251         | 8.8386 | 400000 | 0.4518          |
| 0.2331        | 8.8938 | 402500 | 0.4555          |
| 0.2312        | 8.9491 | 405000 | 0.4507          |
| 0.2399        | 9.0043 | 407500 | 0.4557          |
| 0.2267        | 9.0596 | 410000 | 0.4574          |
| 0.2336        | 9.1148 | 412500 | 0.4580          |
| 0.263         | 9.1701 | 415000 | 0.4567          |
| 0.2207        | 9.2253 | 417500 | 0.4589          |
| 0.2457        | 9.2805 | 420000 | 0.4624          |
| 0.2577        | 9.3358 | 422500 | 0.4583          |
| 0.19          | 9.3910 | 425000 | 0.4600          |
| 0.2513        | 9.4463 | 427500 | 0.4575          |
| 0.2647        | 9.5015 | 430000 | 0.4587          |
| 0.2704        | 9.5567 | 432500 | 0.4577          |
| 0.2397        | 9.6120 | 435000 | 0.4592          |
| 0.2436        | 9.6672 | 437500 | 0.4601          |
| 0.2595        | 9.7225 | 440000 | 0.4591          |
| 0.2617        | 9.7777 | 442500 | 0.4595          |
| 0.231         | 9.8330 | 445000 | 0.4604          |
| 0.2375        | 9.8882 | 447500 | 0.4594          |
| 0.2295        | 9.9434 | 450000 | 0.4597          |
| 0.2289        | 9.9987 | 452500 | 0.4597          |

### Framework versions

- Transformers 4.44.0.dev0
- Pytorch 2.5.0.dev20240625
- Datasets 2.20.0
- Tokenizers 0.19.1
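Although the card leaves the dataset description blank, the logged schedule pins down a few numbers. The sketch below infers the approximate training-set size and the halfway-point learning rate; it assumes `train_batch_size=1` with no gradient accumulation (accumulation is not listed among the hyperparameters) and a warmup-free linear decay to zero, neither of which is confirmed by the card.

```python
# Back-of-the-envelope checks derived from the logged training schedule.
# Assumption: batch size 1 and no gradient accumulation, so one optimizer
# step corresponds to one training example.

initial_lr = 4e-05
total_steps = 452500   # final logged step
final_epoch = 9.9987   # final logged epoch

# Approximate number of training examples = steps per epoch.
examples_per_epoch = total_steps / final_epoch
print(round(examples_per_epoch))  # ~45,256 examples

# Learning rate at a given step, assuming linear decay to 0 with no warmup.
def linear_lr(step, initial=initial_lr, total=total_steps):
    return initial * (1 - step / total)

print(linear_lr(total_steps // 2))  # 2e-05 at the halfway point
```

Under these assumptions the model saw roughly 45k sentence pairs per epoch, ten times over, which is consistent with the validation loss bottoming out near epoch 3 and slowly rising afterwards.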