my_awesome_power_model_llmv2
This model is a fine-tuned version of gpt2 on an unknown dataset. It achieves the following results at the final training epoch (no evaluation-set metrics are reported):
- Train Loss: 0.0347
- Epoch: 599
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 5e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
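
To make the optimizer settings above concrete, here is a minimal sketch of one Adam update with decoupled weight decay, written in plain Python for a single scalar parameter. It is an illustration under assumptions, not the code used for this run: the function name `adamw_step` is hypothetical, the legacy `decay` (0.0) and `amsgrad` (False) options are left out because they are inactive, and the actual optimizer operates on tensors and may exclude some parameters from decay.

```python
import math

# Hyperparameters copied from the optimizer entry above.
LEARNING_RATE = 5e-05
BETA_1 = 0.9
BETA_2 = 0.999
EPSILON = 1e-07
WEIGHT_DECAY_RATE = 0.01


def adamw_step(param, grad, m, v, step):
    """Apply one Adam update with decoupled weight decay to a scalar.

    Hypothetical helper for illustration; `step` counts from 1.
    Returns the updated (param, m, v).
    """
    # Decoupled weight decay: shrink the parameter directly instead of
    # adding an L2 penalty term to the gradient.
    param = param - LEARNING_RATE * WEIGHT_DECAY_RATE * param

    # Exponential moving averages of the gradient and its square.
    m = BETA_1 * m + (1.0 - BETA_1) * grad
    v = BETA_2 * v + (1.0 - BETA_2) * grad * grad

    # Bias correction for the zero-initialized moving averages.
    m_hat = m / (1.0 - BETA_1 ** step)
    v_hat = v / (1.0 - BETA_2 ** step)

    # Adam step scaled by the corrected second-moment estimate.
    param = param - LEARNING_RATE * m_hat / (math.sqrt(v_hat) + EPSILON)
    return param, m, v


# Usage: a few steps on the toy objective f(x) = x**2, whose gradient is 2x.
param, m, v = 1.0, 0.0, 0.0
for step in range(1, 4):
    grad = 2.0 * param
    param, m, v = adamw_step(param, grad, m, v, step)
```

With a learning rate of 5e-05, each step moves a parameter by roughly that amount at most, which is consistent with the slow, steady loss decrease recorded over many epochs.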
Training results
Train Loss | Epoch |
---|---|
14.1299 | 0 |
3.0898 | 1 |
2.8086 | 2 |
2.6899 | 3 |
2.5834 | 4 |
2.5116 | 5 |
2.4435 | 6 |
2.3961 | 7 |
2.3446 | 8 |
2.3011 | 9 |
2.2651 | 10 |
2.2280 | 11 |
2.2007 | 12 |
2.1640 | 13 |
2.1350 | 14 |
2.1105 | 15 |
2.0776 | 16 |
2.0486 | 17 |
2.0297 | 18 |
2.0114 | 19 |
1.9887 | 20 |
1.9679 | 21 |
1.9495 | 22 |
1.9376 | 23 |
1.9145 | 24 |
1.9036 | 25 |
1.8915 | 26 |
1.8738 | 27 |
1.8624 | 28 |
1.8496 | 29 |
1.8310 | 30 |
1.8196 | 31 |
1.8074 | 32 |
1.8021 | 33 |
1.7813 | 34 |
1.7681 | 35 |
1.7548 | 36 |
1.7386 | 37 |
1.7325 | 38 |
1.7149 | 39 |
1.7051 | 40 |
1.7001 | 41 |
1.6815 | 42 |
1.6765 | 43 |
1.6667 | 44 |
1.6528 | 45 |
1.6373 | 46 |
1.6269 | 47 |
1.6237 | 48 |
1.6046 | 49 |
1.6005 | 50 |
1.5919 | 51 |
1.5767 | 52 |
1.5617 | 53 |
1.5556 | 54 |
1.5461 | 55 |
1.5311 | 56 |
1.5313 | 57 |
1.5116 | 58 |
1.5020 | 59 |
1.4975 | 60 |
1.4897 | 61 |
1.4834 | 62 |
1.4677 | 63 |
1.4672 | 64 |
1.4470 | 65 |
1.4409 | 66 |
1.4284 | 67 |
1.4202 | 68 |
1.4174 | 69 |
1.4007 | 70 |
1.3930 | 71 |
1.3868 | 72 |
1.3702 | 73 |
1.3636 | 74 |
1.3557 | 75 |
1.3417 | 76 |
1.3321 | 77 |
1.3206 | 78 |
1.3135 | 79 |
1.3087 | 80 |
1.2974 | 81 |
1.2856 | 82 |
1.2734 | 83 |
1.2660 | 84 |
1.2571 | 85 |
1.2528 | 86 |
1.2330 | 87 |
1.2214 | 88 |
1.2126 | 89 |
1.2075 | 90 |
1.1932 | 91 |
1.1928 | 92 |
1.1717 | 93 |
1.1691 | 94 |
1.1618 | 95 |
1.1453 | 96 |
1.1308 | 97 |
1.1287 | 98 |
1.1187 | 99 |
1.1003 | 100 |
1.0947 | 101 |
1.0822 | 102 |
1.0749 | 103 |
1.0659 | 104 |
1.0546 | 105 |
1.0412 | 106 |
1.0274 | 107 |
1.0248 | 108 |
1.0100 | 109 |
1.0050 | 110 |
0.9935 | 111 |
0.9798 | 112 |
0.9733 | 113 |
0.9604 | 114 |
0.9530 | 115 |
0.9407 | 116 |
0.9290 | 117 |
0.9217 | 118 |
0.9095 | 119 |
0.8929 | 120 |
0.8860 | 121 |
0.8786 | 122 |
0.8684 | 123 |
0.8585 | 124 |
0.8445 | 125 |
0.8398 | 126 |
0.8181 | 127 |
0.8183 | 128 |
0.8030 | 129 |
0.7919 | 130 |
0.7851 | 131 |
0.7743 | 132 |
0.7578 | 133 |
0.7449 | 134 |
0.7329 | 135 |
0.7267 | 136 |
0.7178 | 137 |
0.7089 | 138 |
0.7000 | 139 |
0.6948 | 140 |
0.6842 | 141 |
0.6637 | 142 |
0.6546 | 143 |
0.6454 | 144 |
0.6348 | 145 |
0.6270 | 146 |
0.6150 | 147 |
0.6002 | 148 |
0.5899 | 149 |
0.5803 | 150 |
0.5709 | 151 |
0.5600 | 152 |
0.5534 | 153 |
0.5429 | 154 |
0.5266 | 155 |
0.5207 | 156 |
0.5096 | 157 |
0.4978 | 158 |
0.4878 | 159 |
0.4752 | 160 |
0.4752 | 161 |
0.4633 | 162 |
0.4580 | 163 |
0.4411 | 164 |
0.4268 | 165 |
0.4262 | 166 |
0.4107 | 167 |
0.4053 | 168 |
0.3935 | 169 |
0.4129 | 170 |
0.3874 | 171 |
0.3766 | 172 |
0.3688 | 173 |
0.3505 | 174 |
0.3534 | 175 |
0.3403 | 176 |
0.3310 | 177 |
0.3242 | 178 |
0.3188 | 179 |
0.3130 | 180 |
0.3023 | 181 |
0.2953 | 182 |
0.2907 | 183 |
0.2819 | 184 |
0.2731 | 185 |
0.2706 | 186 |
0.2671 | 187 |
0.2567 | 188 |
0.2512 | 189 |
0.2441 | 190 |
0.2428 | 191 |
0.2378 | 192 |
0.2322 | 193 |
0.2246 | 194 |
0.2223 | 195 |
0.2196 | 196 |
0.2091 | 197 |
0.2052 | 198 |
0.2019 | 199 |
0.2011 | 200 |
0.1975 | 201 |
0.1963 | 202 |
0.1917 | 203 |
0.1898 | 204 |
0.1829 | 205 |
0.1791 | 206 |
0.1733 | 207 |
0.1706 | 208 |
0.1683 | 209 |
0.1646 | 210 |
0.1645 | 211 |
0.1581 | 212 |
0.1533 | 213 |
0.1568 | 214 |
0.1499 | 215 |
0.1490 | 216 |
0.1460 | 217 |
0.1426 | 218 |
0.1444 | 219 |
0.1391 | 220 |
0.1390 | 221 |
0.1380 | 222 |
0.1336 | 223 |
0.1322 | 224 |
0.1316 | 225 |
0.1262 | 226 |
0.1231 | 227 |
0.1235 | 228 |
0.1260 | 229 |
0.1242 | 230 |
0.1218 | 231 |
0.1167 | 232 |
0.1174 | 233 |
0.1169 | 234 |
0.1164 | 235 |
0.1133 | 236 |
0.1138 | 237 |
0.1100 | 238 |
0.1107 | 239 |
0.1079 | 240 |
0.1059 | 241 |
0.1068 | 242 |
0.1023 | 243 |
0.1063 | 244 |
0.1005 | 245 |
0.1014 | 246 |
0.1004 | 247 |
0.0994 | 248 |
0.1061 | 249 |
0.1004 | 250 |
0.0942 | 251 |
0.0975 | 252 |
0.0957 | 253 |
0.0933 | 254 |
0.0924 | 255 |
0.0921 | 256 |
0.0912 | 257 |
0.0897 | 258 |
0.0893 | 259 |
0.0835 | 260 |
0.0861 | 261 |
0.0860 | 262 |
0.0819 | 263 |
0.0830 | 264 |
0.0823 | 265 |
0.0836 | 266 |
0.0800 | 267 |
0.0797 | 268 |
0.0808 | 269 |
0.0785 | 270 |
0.0770 | 271 |
0.0776 | 272 |
0.0780 | 273 |
0.0744 | 274 |
0.0790 | 275 |
0.0765 | 276 |
0.0769 | 277 |
0.0725 | 278 |
0.0740 | 279 |
0.0718 | 280 |
0.0760 | 281 |
0.0741 | 282 |
0.0728 | 283 |
0.0721 | 284 |
0.0726 | 285 |
0.0691 | 286 |
0.0709 | 287 |
0.0710 | 288 |
0.0666 | 289 |
0.0675 | 290 |
0.0690 | 291 |
0.0720 | 292 |
0.0693 | 293 |
0.0685 | 294 |
0.0649 | 295 |
0.0666 | 296 |
0.0669 | 297 |
0.0662 | 298 |
0.0648 | 299 |
0.0663 | 300 |
0.0660 | 301 |
0.0638 | 302 |
0.0628 | 303 |
0.0621 | 304 |
0.0631 | 305 |
0.0611 | 306 |
0.0640 | 307 |
0.0622 | 308 |
0.0643 | 309 |
0.0622 | 310 |
0.0623 | 311 |
0.0607 | 312 |
0.0603 | 313 |
0.0591 | 314 |
0.0620 | 315 |
0.0609 | 316 |
0.0596 | 317 |
0.0594 | 318 |
0.0608 | 319 |
0.0606 | 320 |
0.0587 | 321 |
0.0620 | 322 |
0.0601 | 323 |
0.0590 | 324 |
0.0600 | 325 |
0.0576 | 326 |
0.0581 | 327 |
0.0556 | 328 |
0.0588 | 329 |
0.0561 | 330 |
0.0563 | 331 |
0.0554 | 332 |
0.0596 | 333 |
0.0570 | 334 |
0.0570 | 335 |
0.0552 | 336 |
0.0566 | 337 |
0.0526 | 338 |
0.0528 | 339 |
0.0527 | 340 |
0.0554 | 341 |
0.0574 | 342 |
0.0543 | 343 |
0.0553 | 344 |
0.0530 | 345 |
0.0537 | 346 |
0.0537 | 347 |
0.0536 | 348 |
0.0526 | 349 |
0.0512 | 350 |
0.0506 | 351 |
0.0510 | 352 |
0.0514 | 353 |
0.0496 | 354 |
0.0500 | 355 |
0.0525 | 356 |
0.0533 | 357 |
0.0509 | 358 |
0.0520 | 359 |
0.0523 | 360 |
0.0508 | 361 |
0.0517 | 362 |
0.0513 | 363 |
0.0519 | 364 |
0.0505 | 365 |
0.0490 | 366 |
0.0496 | 367 |
0.0504 | 368 |
0.0467 | 369 |
0.0481 | 370 |
0.0465 | 371 |
0.0480 | 372 |
0.0450 | 373 |
0.0481 | 374 |
0.0515 | 375 |
0.0489 | 376 |
0.0488 | 377 |
0.0481 | 378 |
0.0483 | 379 |
0.0480 | 380 |
0.0490 | 381 |
0.0476 | 382 |
0.0469 | 383 |
0.0489 | 384 |
0.0478 | 385 |
0.0456 | 386 |
0.0465 | 387 |
0.0467 | 388 |
0.0494 | 389 |
0.0506 | 390 |
0.0477 | 391 |
0.0483 | 392 |
0.0449 | 393 |
0.0471 | 394 |
0.0444 | 395 |
0.0469 | 396 |
0.0481 | 397 |
0.0456 | 398 |
0.0448 | 399 |
0.0435 | 400 |
0.0430 | 401 |
0.0441 | 402 |
0.0445 | 403 |
0.0464 | 404 |
0.0469 | 405 |
0.0443 | 406 |
0.0472 | 407 |
0.0458 | 408 |
0.0445 | 409 |
0.0438 | 410 |
0.0443 | 411 |
0.0447 | 412 |
0.0445 | 413 |
0.0436 | 414 |
0.0435 | 415 |
0.0427 | 416 |
0.0429 | 417 |
0.0430 | 418 |
0.0437 | 419 |
0.0445 | 420 |
0.0427 | 421 |
0.0447 | 422 |
0.0447 | 423 |
0.0436 | 424 |
0.0449 | 425 |
0.0445 | 426 |
0.0444 | 427 |
0.0439 | 428 |
0.0426 | 429 |
0.0440 | 430 |
0.0425 | 431 |
0.0418 | 432 |
0.0423 | 433 |
0.0437 | 434 |
0.0431 | 435 |
0.0430 | 436 |
0.0398 | 437 |
0.0405 | 438 |
0.0398 | 439 |
0.0416 | 440 |
0.0407 | 441 |
0.0413 | 442 |
0.0428 | 443 |
0.0414 | 444 |
0.0435 | 445 |
0.0425 | 446 |
0.0411 | 447 |
0.0414 | 448 |
0.0415 | 449 |
0.0436 | 450 |
0.0424 | 451 |
0.0429 | 452 |
0.0400 | 453 |
0.0414 | 454 |
0.0393 | 455 |
0.0389 | 456 |
0.0395 | 457 |
0.0403 | 458 |
0.0386 | 459 |
0.0399 | 460 |
0.0390 | 461 |
0.0379 | 462 |
0.0403 | 463 |
0.0400 | 464 |
0.0396 | 465 |
0.0394 | 466 |
0.0387 | 467 |
0.0401 | 468 |
0.0394 | 469 |
0.0392 | 470 |
0.0418 | 471 |
0.0407 | 472 |
0.0392 | 473 |
0.0414 | 474 |
0.0406 | 475 |
0.0407 | 476 |
0.0409 | 477 |
0.0393 | 478 |
0.0411 | 479 |
0.0399 | 480 |
0.0398 | 481 |
0.0403 | 482 |
0.0382 | 483 |
0.0381 | 484 |
0.0373 | 485 |
0.0390 | 486 |
0.0375 | 487 |
0.0371 | 488 |
0.0393 | 489 |
0.0382 | 490 |
0.0397 | 491 |
0.0389 | 492 |
0.0400 | 493 |
0.0387 | 494 |
0.0388 | 495 |
0.0383 | 496 |
0.0366 | 497 |
0.0380 | 498 |
0.0379 | 499 |
0.0390 | 500 |
0.0401 | 501 |
0.0392 | 502 |
0.0368 | 503 |
0.0386 | 504 |
0.0369 | 505 |
0.0373 | 506 |
0.0376 | 507 |
0.0380 | 508 |
0.0374 | 509 |
0.0401 | 510 |
0.0391 | 511 |
0.0373 | 512 |
0.0383 | 513 |
0.0372 | 514 |
0.0378 | 515 |
0.0384 | 516 |
0.0371 | 517 |
0.0359 | 518 |
0.0354 | 519 |
0.0366 | 520 |
0.0442 | 521 |
0.0393 | 522 |
0.0378 | 523 |
0.0370 | 524 |
0.0382 | 525 |
0.0366 | 526 |
0.0380 | 527 |
0.0370 | 528 |
0.0393 | 529 |
0.0361 | 530 |
0.0364 | 531 |
0.0390 | 532 |
0.0371 | 533 |
0.0367 | 534 |
0.0376 | 535 |
0.0365 | 536 |
0.0371 | 537 |
0.0374 | 538 |
0.0378 | 539 |
0.0355 | 540 |
0.0352 | 541 |
0.0342 | 542 |
0.0348 | 543 |
0.0361 | 544 |
0.0380 | 545 |
0.0367 | 546 |
0.0354 | 547 |
0.0341 | 548 |
0.0352 | 549 |
0.0344 | 550 |
0.0348 | 551 |
0.0354 | 552 |
0.0370 | 553 |
0.0379 | 554 |
0.0362 | 555 |
0.0366 | 556 |
0.0369 | 557 |
0.0355 | 558 |
0.0359 | 559 |
0.0371 | 560 |
0.0359 | 561 |
0.0344 | 562 |
0.0355 | 563 |
0.0361 | 564 |
0.0345 | 565 |
0.0345 | 566 |
0.0348 | 567 |
0.0343 | 568 |
0.0340 | 569 |
0.0351 | 570 |
0.0344 | 571 |
0.0341 | 572 |
0.0350 | 573 |
0.0341 | 574 |
0.0347 | 575 |
0.0336 | 576 |
0.0339 | 577 |
0.0334 | 578 |
0.0340 | 579 |
0.0349 | 580 |
0.0356 | 581 |
0.0353 | 582 |
0.0356 | 583 |
0.0369 | 584 |
0.0360 | 585 |
0.0358 | 586 |
0.0354 | 587 |
0.0350 | 588 |
0.0359 | 589 |
0.0363 | 590 |
0.0342 | 591 |
0.0355 | 592 |
0.0352 | 593 |
0.0337 | 594 |
0.0333 | 595 |
0.0343 | 596 |
0.0352 | 597 |
0.0333 | 598 |
0.0347 | 599 |
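
The losses in the table can be easier to interpret as perplexities. The sketch below assumes the logged value is the mean per-token cross-entropy in nats, which is the usual causal language modeling loss but is not confirmed by this card; the helper name `perplexity` is hypothetical.

```python
import math


def perplexity(mean_cross_entropy):
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# A few (epoch, train loss) pairs copied from the table above.
samples = {0: 14.1299, 1: 3.0898, 100: 1.1003, 599: 0.0347}

for epoch, loss in samples.items():
    print(f"epoch {epoch:>3}: loss={loss:.4f}  perplexity={perplexity(loss):.4f}")
```

Under that assumption, the final training perplexity is very close to 1. Because no evaluation loss is reported, these figures describe fit to the training data only and say nothing about performance on unseen text.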
Framework versions
- Transformers 4.35.2
- TensorFlow 2.15.0
- Datasets 2.16.1
- Tokenizers 0.15.1
Model tree for MohamedAAK/my_awesome_power_model_llmv2
- Base model: openai-community/gpt2