---
base_model: unsloth/gemma-2-9b
library_name: peft
license: gemma
tags:
  - unsloth
  - generated_from_trainer
model-index:
  - name: gemma-2-9b_metamath_ortho
    results: []
---

gemma-2-9b_metamath_ortho

This model is a PEFT fine-tune of unsloth/gemma-2-9b on an undocumented dataset (the model name suggests MetaMath-style math data, but the card does not say). It achieves the following result on the evaluation set:

  • Loss: 10.4991
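
Because this checkpoint is a PEFT adapter (see Framework versions below), it is loaded on top of the base model rather than standalone. A minimal sketch; the repo id is a hypothetical guess inferred from the model name on this card:

```python
# Minimal loading sketch for a PEFT adapter over the unsloth/gemma-2-9b base.
# The adapter repo id below is an assumption, not confirmed by this card.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("unsloth/gemma-2-9b", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("unsloth/gemma-2-9b")
model = PeftModel.from_pretrained(base, "imdatta0/gemma-2-9b_metamath_ortho")  # hypothetical repo id
```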

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 1
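
For context, here is how these settings map onto the standard Transformers TrainingArguments API. This is a sketch only: the card does not document the dataset, the trainer wiring, or the PEFT/LoRA configuration, so those parts are omitted.

```python
# Sketch: the card's reported hyperparameters expressed as TrainingArguments.
# Everything not listed in the card (dataset, PEFT config, etc.) is omitted.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="gemma-2-9b_metamath_ortho",
    learning_rate=3e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=32,  # 2 x 32 = total train batch size of 64 on one device
    lr_scheduler_type="cosine",
    warmup_ratio=0.02,
    num_train_epochs=1,
    adam_beta1=0.9,      # Adam betas/epsilon as reported on the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```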

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.7559        | 0.0211 | 13   | 11.2124         |
| 11.6216       | 0.0421 | 26   | 11.5534         |
| 11.3932       | 0.0632 | 39   | 11.2944         |
| 10.8497       | 0.0843 | 52   | 10.5202         |
| 10.6949       | 0.1053 | 65   | 10.3737         |
| 10.205        | 0.1264 | 78   | 10.8305         |
| 10.3395       | 0.1474 | 91   | 11.1255         |
| 10.8575       | 0.1685 | 104  | 10.1467         |
| 10.5782       | 0.1896 | 117  | 10.6096         |
| 10.5443       | 0.2106 | 130  | 10.3814         |
| 10.4871       | 0.2317 | 143  | 10.6978         |
| 10.4177       | 0.2528 | 156  | 10.8357         |
| 9.3249        | 0.2738 | 169  | 9.7925          |
| 11.959        | 0.2949 | 182  | 12.0069         |
| 12.0051       | 0.3159 | 195  | 11.9632         |
| 11.9939       | 0.3370 | 208  | 11.9428         |
| 11.711        | 0.3581 | 221  | 11.8586         |
| 11.7235       | 0.3791 | 234  | 11.4550         |
| 11.4001       | 0.4002 | 247  | 11.5482         |
| 11.3359       | 0.4213 | 260  | 11.1959         |
| 11.5279       | 0.4423 | 273  | 11.5632         |
| 11.3903       | 0.4634 | 286  | 11.0193         |
| 11.2189       | 0.4845 | 299  | 11.1413         |
| 11.0779       | 0.5055 | 312  | 10.9722         |
| 11.0063       | 0.5266 | 325  | 10.8490         |
| 11.1246       | 0.5476 | 338  | 11.1407         |
| 10.9858       | 0.5687 | 351  | 10.7436         |
| 10.6825       | 0.5898 | 364  | 11.2291         |
| 11.3333       | 0.6108 | 377  | 11.0195         |
| 10.9347       | 0.6319 | 390  | 10.9265         |
| 10.7369       | 0.6530 | 403  | 10.7926         |
| 10.7869       | 0.6740 | 416  | 10.7711         |
| 10.7802       | 0.6951 | 429  | 10.6187         |
| 10.8631       | 0.7162 | 442  | 10.8749         |
| 10.6492       | 0.7372 | 455  | 10.6884         |
| 10.7395       | 0.7583 | 468  | 10.6685         |
| 10.7239       | 0.7793 | 481  | 10.5783         |
| 10.5349       | 0.8004 | 494  | 10.4897         |
| 10.3873       | 0.8215 | 507  | 10.6675         |
| 10.4529       | 0.8425 | 520  | 10.5227         |
| 10.4378       | 0.8636 | 533  | 10.4906         |
| 10.5446       | 0.8847 | 546  | 10.5181         |
| 10.469        | 0.9057 | 559  | 10.4954         |
| 10.4688       | 0.9268 | 572  | 10.4928         |
| 10.4527       | 0.9478 | 585  | 10.4564         |
| 10.4343       | 0.9689 | 598  | 10.4873         |
| 10.4796       | 0.9900 | 611  | 10.4991         |
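
The validation loss never drops below about 9.8 and ends near where it started, so plotting the table is the quickest way to read the trajectory. A minimal sketch, assuming matplotlib is available; the values are copied from the table above:

```python
# Plot the (step, validation loss) pairs from the training results table.
import matplotlib.pyplot as plt

steps = [13, 26, 39, 52, 65, 78, 91, 104, 117, 130, 143, 156, 169,
         182, 195, 208, 221, 234, 247, 260, 273, 286, 299, 312, 325,
         338, 351, 364, 377, 390, 403, 416, 429, 442, 455, 468, 481,
         494, 507, 520, 533, 546, 559, 572, 585, 598, 611]
val_loss = [11.2124, 11.5534, 11.2944, 10.5202, 10.3737, 10.8305, 11.1255,
            10.1467, 10.6096, 10.3814, 10.6978, 10.8357, 9.7925, 12.0069,
            11.9632, 11.9428, 11.8586, 11.4550, 11.5482, 11.1959, 11.5632,
            11.0193, 11.1413, 10.9722, 10.8490, 11.1407, 10.7436, 11.2291,
            11.0195, 10.9265, 10.7926, 10.7711, 10.6187, 10.8749, 10.6884,
            10.6685, 10.5783, 10.4897, 10.6675, 10.5227, 10.4906, 10.5181,
            10.4954, 10.4928, 10.4564, 10.4873, 10.4991]

plt.plot(steps, val_loss, marker="o")
plt.xlabel("Step")
plt.ylabel("Validation loss")
plt.title("gemma-2-9b_metamath_ortho: validation loss")
plt.show()
```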

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.0
  • PyTorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1