---
base_model: google/gemma-7b
library_name: peft
license: gemma
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: gemma7bit-lora-sql
  results: []
---

# gemma7bit-lora-sql

This model is a fine-tuned version of [google/gemma-7b](https://huggingface.co./google/gemma-7b) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 40.8546

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 1
- eval_batch_size: 8
- seed: 1399
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 2
- training_steps: 500
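
For reference, the list above maps onto `transformers.TrainingArguments` roughly as follows. This is a minimal sketch, not the recovered training script: `output_dir` is a placeholder, the AdamW variant is assumed from the logged betas/epsilon, and `logging_steps`/`eval_steps` of 2 are inferred from the results table below, which logs both losses every 2 steps.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the arguments implied by this card's
# "Training hyperparameters" list. Only the values shown on the card are
# known; everything else is an assumption.
args = TrainingArguments(
    output_dir="gemma7bit-lora-sql",  # placeholder, not a recorded path
    learning_rate=3e-4,               # learning_rate: 0.0003
    per_device_train_batch_size=1,    # train_batch_size: 1
    per_device_eval_batch_size=8,     # eval_batch_size: 8
    seed=1399,                        # seed: 1399
    adam_beta1=0.9,                   # optimizer: Adam with betas=(0.9,0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,                # and epsilon=1e-08
    lr_scheduler_type="cosine",       # lr_scheduler_type: cosine
    warmup_steps=2,                   # lr_scheduler_warmup_steps: 2
    max_steps=500,                    # training_steps: 500
    logging_steps=2,                  # inferred from the results table
    eval_strategy="steps",            # inferred: eval ran every 2 steps
    eval_steps=2,
)
```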
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 47.4397 | 0.0000 | 2 | 112.0961 |
| 54.9563 | 0.0001 | 4 | 113.0320 |
| 43.0701 | 0.0001 | 6 | 105.0883 |
| 29.3374 | 0.0001 | 8 | 93.7564 |
| 24.013 | 0.0001 | 10 | 70.5026 |
| 5.7244 | 0.0002 | 12 | 70.3644 |
| 6.7112 | 0.0002 | 14 | 69.0918 |
| 5.139 | 0.0002 | 16 | 67.7594 |
| 5.658 | 0.0002 | 18 | 64.8925 |
| 3.348 | 0.0003 | 20 | 62.9086 |
| 3.0009 | 0.0003 | 22 | 54.9081 |
| 3.1078 | 0.0003 | 24 | 47.0123 |
| 2.9829 | 0.0003 | 26 | 44.8515 |
| 2.4287 | 0.0004 | 28 | 42.1563 |
| 2.1561 | 0.0004 | 30 | 39.5831 |
| 2.3805 | 0.0004 | 32 | 37.8210 |
| 4.199 | 0.0004 | 34 | 36.5321 |
| 4.2891 | 0.0005 | 36 | 35.5581 |
| 2.8376 | 0.0005 | 38 | 35.1185 |
| 2.4216 | 0.0005 | 40 | 35.1674 |
| 2.2408 | 0.0005 | 42 | 34.9562 |
| 3.4941 | 0.0006 | 44 | 35.2440 |
| 3.4866 | 0.0006 | 46 | 34.5079 |
| 2.2815 | 0.0006 | 48 | 34.1046 |
| 2.2584 | 0.0006 | 50 | 34.0249 |
| 2.7932 | 0.0007 | 52 | 34.8069 |
| 2.8995 | 0.0007 | 54 | 35.0606 |
| 3.3107 | 0.0007 | 56 | 35.8230 |
| 3.0793 | 0.0007 | 58 | 36.0362 |
| 4.5829 | 0.0008 | 60 | 34.8489 |
| 2.6841 | 0.0008 | 62 | 33.6494 |
| 3.5738 | 0.0008 | 64 | 32.4676 |
| 2.955 | 0.0008 | 66 | 31.9876 |
| 2.1847 | 0.0009 | 68 | 31.4324 |
| 3.5749 | 0.0009 | 70 | 31.4434 |
| 2.0652 | 0.0009 | 72 | 31.6449 |
| 1.9506 | 0.0009 | 74 | 31.8311 |
| 2.6852 | 0.0010 | 76 | 32.0123 |
| 1.8463 | 0.0010 | 78 | 32.2012 |
| 2.4999 | 0.0010 | 80 | 32.4074 |
| 1.7525 | 0.0010 | 82 | 32.5013 |
| 1.865 | 0.0011 | 84 | 32.7458 |
| 2.5512 | 0.0011 | 86 | 32.9542 |
| 2.041 | 0.0011 | 88 | 33.7792 |
| 3.4588 | 0.0011 | 90 | 33.5860 |
| 2.2258 | 0.0012 | 92 | 33.9242 |
| 2.1416 | 0.0012 | 94 | 34.2110 |
| 1.9904 | 0.0012 | 96 | 34.1852 |
| 1.9793 | 0.0012 | 98 | 34.1257 |
| 3.3329 | 0.0013 | 100 | 34.2512 |
| 2.6011 | 0.0013 | 102 | 34.4635 |
| 2.4212 | 0.0013 | 104 | 34.5869 |
| 1.941 | 0.0014 | 106 | 34.7022 |
| 2.4623 | 0.0014 | 108 | 34.9359 |
| 2.4267 | 0.0014 | 110 | 35.1085 |
| 1.7913 | 0.0014 | 112 | 35.1962 |
| 1.6845 | 0.0015 | 114 | 35.5859 |
| 3.0888 | 0.0015 | 116 | 35.8237 |
| 3.4959 | 0.0015 | 118 | 35.4403 |
| 2.5661 | 0.0015 | 120 | 35.3171 |
| 2.4044 | 0.0016 | 122 | 35.1409 |
| 3.1554 | 0.0016 | 124 | 35.0385 |
| 2.0637 | 0.0016 | 126 | 35.4118 |
| 5.6131 | 0.0016 | 128 | 35.2343 |
| 3.0214 | 0.0017 | 130 | 35.9148 |
| 1.771 | 0.0017 | 132 | 36.5919 |
| 2.4126 | 0.0017 | 134 | 36.8129 |
| 2.5102 | 0.0017 | 136 | 36.6166 |
| 6.5612 | 0.0018 | 138 | 36.9545 |
| 2.1154 | 0.0018 | 140 | 36.8204 |
| 2.533 | 0.0018 | 142 | 36.5374 |
| 1.7012 | 0.0018 | 144 | 36.6904 |
| 2.2287 | 0.0019 | 146 | 36.1521 |
| 4.2646 | 0.0019 | 148 | 36.1889 |
| 1.8624 | 0.0019 | 150 | 36.5876 |
| 1.9946 | 0.0019 | 152 | 36.6302 |
| 2.124 | 0.0020 | 154 | 36.6274 |
| 3.01 | 0.0020 | 156 | 36.6652 |
| 1.928 | 0.0020 | 158 | 37.0886 |
| 2.6035 | 0.0020 | 160 | 37.2648 |
| 2.2572 | 0.0021 | 162 | 37.4929 |
| 1.5284 | 0.0021 | 164 | 37.7779 |
| 1.1103 | 0.0021 | 166 | 37.9401 |
| 2.4597 | 0.0021 | 168 | 37.7270 |
| 2.4846 | 0.0022 | 170 | 37.4224 |
| 2.6234 | 0.0022 | 172 | 36.6518 |
| 2.4765 | 0.0022 | 174 | 36.2149 |
| 2.0448 | 0.0022 | 176 | 35.9293 |
| 2.2736 | 0.0023 | 178 | 35.5881 |
| 2.7181 | 0.0023 | 180 | 35.3821 |
| 1.9195 | 0.0023 | 182 | 35.2214 |
| 2.9274 | 0.0023 | 184 | 35.0837 |
| 3.191 | 0.0024 | 186 | 35.1131 |
| 2.6804 | 0.0024 | 188 | 35.1649 |
| 1.5547 | 0.0024 | 190 | 35.3133 |
| 2.2601 | 0.0024 | 192 | 35.6737 |
| 2.5229 | 0.0025 | 194 | 36.1338 |
| 2.6806 | 0.0025 | 196 | 36.2942 |
| 2.2258 | 0.0025 | 198 | 36.4748 |
| 1.2856 | 0.0025 | 200 | 36.9566 |
| 2.1439 | 0.0026 | 202 | 37.1834 |
| 4.0704 | 0.0026 | 204 | 37.5976 |
| 2.5138 | 0.0026 | 206 | 38.2877 |
| 2.9025 | 0.0027 | 208 | 38.5739 |
| 1.8761 | 0.0027 | 210 | 38.3348 |
| 1.9228 | 0.0027 | 212 | 38.3183 |
| 1.7924 | 0.0027 | 214 | 38.2928 |
| 2.7619 | 0.0028 | 216 | 38.1185 |
| 2.1031 | 0.0028 | 218 | 37.7249 |
| 2.6893 | 0.0028 | 220 | 37.7826 |
| 2.255 | 0.0028 | 222 | 37.7949 |
| 2.754 | 0.0029 | 224 | 37.8576 |
| 1.6294 | 0.0029 | 226 | 38.2263 |
| 1.8586 | 0.0029 | 228 | 38.4837 |
| 2.4252 | 0.0029 | 230 | 38.7646 |
| 2.36 | 0.0030 | 232 | 38.9834 |
| 1.4407 | 0.0030 | 234 | 39.1561 |
| 1.6109 | 0.0030 | 236 | 39.3041 |
| 2.2582 | 0.0030 | 238 | 39.3389 |
| 2.8185 | 0.0031 | 240 | 39.5245 |
| 1.6233 | 0.0031 | 242 | 39.3154 |
| 2.4039 | 0.0031 | 244 | 39.0988 |
| 1.7734 | 0.0031 | 246 | 39.0567 |
| 1.4779 | 0.0032 | 248 | 39.0881 |
| 2.7848 | 0.0032 | 250 | 38.9895 |
| 2.2963 | 0.0032 | 252 | 39.2507 |
| 2.0605 | 0.0032 | 254 | 39.3339 |
| 3.3667 | 0.0033 | 256 | 39.5060 |
| 2.9702 | 0.0033 | 258 | 39.5491 |
| 2.6734 | 0.0033 | 260 | 39.7907 |
| 2.4727 | 0.0033 | 262 | 40.1472 |
| 2.7539 | 0.0034 | 264 | 40.4749 |
| 1.601 | 0.0034 | 266 | 40.3649 |
| 2.1531 | 0.0034 | 268 | 40.2932 |
| 1.8656 | 0.0034 | 270 | 40.2728 |
| 1.9617 | 0.0035 | 272 | 40.3498 |
| 1.8911 | 0.0035 | 274 | 40.3157 |
| 2.3878 | 0.0035 | 276 | 40.2882 |
| 2.677 | 0.0035 | 278 | 40.4437 |
| 2.8035 | 0.0036 | 280 | 40.2423 |
| 1.7537 | 0.0036 | 282 | 40.0182 |
| 1.5873 | 0.0036 | 284 | 39.8449 |
| 1.7802 | 0.0036 | 286 | 39.7251 |
| 2.1861 | 0.0037 | 288 | 39.3972 |
| 1.9197 | 0.0037 | 290 | 39.4064 |
| 2.6752 | 0.0037 | 292 | 39.4320 |
| 1.7225 | 0.0037 | 294 | 39.4498 |
| 1.7274 | 0.0038 | 296 | 39.4309 |
| 3.9891 | 0.0038 | 298 | 40.1752 |
| 2.5153 | 0.0038 | 300 | 40.9025 |
| 2.0587 | 0.0038 | 302 | 41.4380 |
| 2.3115 | 0.0039 | 304 | 41.9152 |
| 1.8684 | 0.0039 | 306 | 42.4118 |
| 2.0388 | 0.0039 | 308 | 42.8904 |
| 2.9396 | 0.0040 | 310 | 43.0102 |
| 1.5832 | 0.0040 | 312 | 43.0678 |
| 1.897 | 0.0040 | 314 | 43.0292 |
| 2.2008 | 0.0040 | 316 | 43.0302 |
| 2.4185 | 0.0041 | 318 | 42.8252 |
| 1.9265 | 0.0041 | 320 | 42.5088 |
| 2.5759 | 0.0041 | 322 | 42.2636 |
| 2.9898 | 0.0041 | 324 | 42.1571 |
| 1.7106 | 0.0042 | 326 | 41.7366 |
| 2.3907 | 0.0042 | 328 | 41.3667 |
| 2.4861 | 0.0042 | 330 | 41.3056 |
| 1.6998 | 0.0042 | 332 | 41.2167 |
| 2.6034 | 0.0043 | 334 | 41.2615 |
| 1.6455 | 0.0043 | 336 | 41.2327 |
| 1.8484 | 0.0043 | 338 | 41.2317 |
| 2.2123 | 0.0043 | 340 | 41.2374 |
| 1.8939 | 0.0044 | 342 | 41.1753 |
| 1.881 | 0.0044 | 344 | 41.1000 |
| 1.5313 | 0.0044 | 346 | 40.9959 |
| 2.3099 | 0.0044 | 348 | 40.9817 |
| 2.2593 | 0.0045 | 350 | 40.9572 |
| 2.2597 | 0.0045 | 352 | 40.9278 |
| 2.1038 | 0.0045 | 354 | 40.8672 |
| 1.6107 | 0.0045 | 356 | 40.6815 |
| 2.0831 | 0.0046 | 358 | 40.5641 |
| 2.2921 | 0.0046 | 360 | 40.5117 |
| 2.3178 | 0.0046 | 362 | 40.5802 |
| 1.6295 | 0.0046 | 364 | 40.4780 |
| 2.038 | 0.0047 | 366 | 40.5544 |
| 1.7012 | 0.0047 | 368 | 40.7328 |
| 2.5292 | 0.0047 | 370 | 40.8337 |
| 1.8677 | 0.0047 | 372 | 40.9356 |
| 1.5897 | 0.0048 | 374 | 41.0250 |
| 1.5096 | 0.0048 | 376 | 41.0558 |
| 1.6413 | 0.0048 | 378 | 41.2060 |
| 1.6334 | 0.0048 | 380 | 41.2175 |
| 2.0367 | 0.0049 | 382 | 41.3215 |
| 1.9155 | 0.0049 | 384 | 41.4322 |
| 1.9553 | 0.0049 | 386 | 41.4096 |
| 2.3982 | 0.0049 | 388 | 41.3870 |
| 2.1094 | 0.0050 | 390 | 41.2572 |
| 1.9943 | 0.0050 | 392 | 41.1927 |
| 2.1017 | 0.0050 | 394 | 41.1805 |
| 1.8297 | 0.0050 | 396 | 41.0817 |
| 2.2271 | 0.0051 | 398 | 41.0460 |
| 2.022 | 0.0051 | 400 | 41.0754 |
| 1.8099 | 0.0051 | 402 | 41.0777 |
| 2.0973 | 0.0051 | 404 | 41.1348 |
| 2.03 | 0.0052 | 406 | 41.1109 |
| 1.7342 | 0.0052 | 408 | 41.1719 |
| 2.0422 | 0.0052 | 410 | 41.1616 |
| 2.6192 | 0.0052 | 412 | 41.0411 |
| 1.7107 | 0.0053 | 414 | 41.0704 |
| 2.8018 | 0.0053 | 416 | 41.0641 |
| 1.3767 | 0.0053 | 418 | 41.0719 |
| 1.9952 | 0.0054 | 420 | 41.0151 |
| 1.7584 | 0.0054 | 422 | 40.9978 |
| 2.1318 | 0.0054 | 424 | 40.9933 |
| 2.3412 | 0.0054 | 426 | 40.9837 |
| 1.6604 | 0.0055 | 428 | 41.0310 |
| 1.6301 | 0.0055 | 430 | 40.9782 |
| 2.0232 | 0.0055 | 432 | 40.9377 |
| 1.7096 | 0.0055 | 434 | 40.9645 |
| 2.1696 | 0.0056 | 436 | 40.9631 |
| 1.5297 | 0.0056 | 438 | 40.9690 |
| 1.4017 | 0.0056 | 440 | 41.0132 |
| 1.7817 | 0.0056 | 442 | 40.9486 |
| 1.7264 | 0.0057 | 444 | 40.9499 |
| 1.8601 | 0.0057 | 446 | 41.0064 |
| 1.9614 | 0.0057 | 448 | 41.0266 |
| 2.3045 | 0.0057 | 450 | 41.0035 |
| 2.67 | 0.0058 | 452 | 41.0159 |
| 1.5752 | 0.0058 | 454 | 40.9748 |
| 1.7464 | 0.0058 | 456 | 40.9395 |
| 1.9167 | 0.0058 | 458 | 40.9119 |
| 1.8777 | 0.0059 | 460 | 40.9021 |
| 1.5879 | 0.0059 | 462 | 40.9164 |
| 1.942 | 0.0059 | 464 | 40.8847 |
| 1.6303 | 0.0059 | 466 | 40.9104 |
| 2.1252 | 0.0060 | 468 | 40.9000 |
| 2.2879 | 0.0060 | 470 | 40.9209 |
| 1.7646 | 0.0060 | 472 | 40.8601 |
| 2.3169 | 0.0060 | 474 | 40.8726 |
| 1.7797 | 0.0061 | 476 | 40.8563 |
| 2.0428 | 0.0061 | 478 | 40.8609 |
| 2.4124 | 0.0061 | 480 | 40.8663 |
| 2.2955 | 0.0061 | 482 | 40.8601 |
| 1.3035 | 0.0062 | 484 | 40.8517 |
| 2.611 | 0.0062 | 486 | 40.8781 |
| 2.0677 | 0.0062 | 488 | 40.8694 |
| 2.1645 | 0.0062 | 490 | 40.8864 |
| 2.0708 | 0.0063 | 492 | 40.8633 |
| 1.663 | 0.0063 | 494 | 40.8689 |
| 1.9784 | 0.0063 | 496 | 40.8672 |
| 1.7215 | 0.0063 | 498 | 40.8439 |
| 2.2366 | 0.0064 | 500 | 40.8546 |
### Framework versions

- PEFT 0.12.0
- Transformers 4.44.0
- Pytorch 2.2.2+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1
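
Since this is a PEFT (LoRA) adapter rather than a full model, it is loaded on top of the `google/gemma-7b` base weights. Below is a minimal loading sketch against the framework versions above; the `your-username/gemma7bit-lora-sql` repo id and the prompt are placeholders, and `google/gemma-7b` is a gated model that requires accepting its license on the Hub.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the gated base model (requires Hub authentication and license acceptance).
base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")

# Attach the LoRA adapter. "your-username/gemma7bit-lora-sql" is a
# placeholder; substitute the actual location of this adapter.
model = PeftModel.from_pretrained(base, "your-username/gemma7bit-lora-sql")

# Illustrative prompt only; the training data for this adapter is unknown.
inputs = tokenizer("Write a SQL query that lists all users.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```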