---
base_model: unsloth/gemma-2-9b
library_name: peft
license: gemma
tags:
  - unsloth
  - generated_from_trainer
model-index:
  - name: gemma-2-9b_metamath_ortho
    results: []
---

gemma-2-9b_metamath_ortho

This model is a PEFT fine-tune of unsloth/gemma-2-9b on an undocumented dataset (the model name suggests MetaMath-style math data, but the card does not say). It achieves the following result on the evaluation set:

  • Loss: 10.4991
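
Because this checkpoint is a PEFT adapter (see Framework versions below), it is loaded on top of the base model rather than standalone. A minimal sketch; the repo id is a hypothetical guess inferred from the model name on this card:

```python
# Minimal loading sketch for a PEFT adapter over the unsloth/gemma-2-9b base.
# The adapter repo id below is an assumption, not confirmed by this card.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("unsloth/gemma-2-9b", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("unsloth/gemma-2-9b")
model = PeftModel.from_pretrained(base, "imdatta0/gemma-2-9b_metamath_ortho")  # hypothetical repo id
```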

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 1
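
For context, here is how these settings map onto the standard Transformers TrainingArguments API. This is a sketch only: the card does not document the dataset, the trainer wiring, or the PEFT/LoRA configuration, so those parts are omitted.

```python
# Sketch: the card's reported hyperparameters expressed as TrainingArguments.
# Everything not listed in the card (dataset, PEFT config, etc.) is omitted.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="gemma-2-9b_metamath_ortho",
    learning_rate=3e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=32,  # 2 x 32 = total train batch size of 64 on one device
    lr_scheduler_type="cosine",
    warmup_ratio=0.02,
    num_train_epochs=1,
    adam_beta1=0.9,      # Adam betas/epsilon as reported on the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```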

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.7559        | 0.0211 | 13   | 11.2124         |
| 11.6216       | 0.0421 | 26   | 11.5534         |
| 11.3932       | 0.0632 | 39   | 11.2944         |
| 10.8497       | 0.0843 | 52   | 10.5202         |
| 10.6949       | 0.1053 | 65   | 10.3737         |
| 10.205        | 0.1264 | 78   | 10.8305         |
| 10.3395       | 0.1474 | 91   | 11.1255         |
| 10.8575       | 0.1685 | 104  | 10.1467         |
| 10.5782       | 0.1896 | 117  | 10.6096         |
| 10.5443       | 0.2106 | 130  | 10.3814         |
| 10.4871       | 0.2317 | 143  | 10.6978         |
| 10.4177       | 0.2528 | 156  | 10.8357         |
| 9.3249        | 0.2738 | 169  | 9.7925          |
| 11.959        | 0.2949 | 182  | 12.0069         |
| 12.0051       | 0.3159 | 195  | 11.9632         |
| 11.9939       | 0.3370 | 208  | 11.9428         |
| 11.711        | 0.3581 | 221  | 11.8586         |
| 11.7235       | 0.3791 | 234  | 11.4550         |
| 11.4001       | 0.4002 | 247  | 11.5482         |
| 11.3359       | 0.4213 | 260  | 11.1959         |
| 11.5279       | 0.4423 | 273  | 11.5632         |
| 11.3903       | 0.4634 | 286  | 11.0193         |
| 11.2189       | 0.4845 | 299  | 11.1413         |
| 11.0779       | 0.5055 | 312  | 10.9722         |
| 11.0063       | 0.5266 | 325  | 10.8490         |
| 11.1246       | 0.5476 | 338  | 11.1407         |
| 10.9858       | 0.5687 | 351  | 10.7436         |
| 10.6825       | 0.5898 | 364  | 11.2291         |
| 11.3333       | 0.6108 | 377  | 11.0195         |
| 10.9347       | 0.6319 | 390  | 10.9265         |
| 10.7369       | 0.6530 | 403  | 10.7926         |
| 10.7869       | 0.6740 | 416  | 10.7711         |
| 10.7802       | 0.6951 | 429  | 10.6187         |
| 10.8631       | 0.7162 | 442  | 10.8749         |
| 10.6492       | 0.7372 | 455  | 10.6884         |
| 10.7395       | 0.7583 | 468  | 10.6685         |
| 10.7239       | 0.7793 | 481  | 10.5783         |
| 10.5349       | 0.8004 | 494  | 10.4897         |
| 10.3873       | 0.8215 | 507  | 10.6675         |
| 10.4529       | 0.8425 | 520  | 10.5227         |
| 10.4378       | 0.8636 | 533  | 10.4906         |
| 10.5446       | 0.8847 | 546  | 10.5181         |
| 10.469        | 0.9057 | 559  | 10.4954         |
| 10.4688       | 0.9268 | 572  | 10.4928         |
| 10.4527       | 0.9478 | 585  | 10.4564         |
| 10.4343       | 0.9689 | 598  | 10.4873         |
| 10.4796       | 0.9900 | 611  | 10.4991         |
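
The validation loss never drops below about 9.8 and ends near where it started, so plotting the table is the quickest way to read the trajectory. A minimal sketch, assuming matplotlib is available; the values are copied from the table above:

```python
# Plot the (step, validation loss) pairs from the training results table.
import matplotlib.pyplot as plt

steps = [13, 26, 39, 52, 65, 78, 91, 104, 117, 130, 143, 156, 169,
         182, 195, 208, 221, 234, 247, 260, 273, 286, 299, 312, 325,
         338, 351, 364, 377, 390, 403, 416, 429, 442, 455, 468, 481,
         494, 507, 520, 533, 546, 559, 572, 585, 598, 611]
val_loss = [11.2124, 11.5534, 11.2944, 10.5202, 10.3737, 10.8305, 11.1255,
            10.1467, 10.6096, 10.3814, 10.6978, 10.8357, 9.7925, 12.0069,
            11.9632, 11.9428, 11.8586, 11.4550, 11.5482, 11.1959, 11.5632,
            11.0193, 11.1413, 10.9722, 10.8490, 11.1407, 10.7436, 11.2291,
            11.0195, 10.9265, 10.7926, 10.7711, 10.6187, 10.8749, 10.6884,
            10.6685, 10.5783, 10.4897, 10.6675, 10.5227, 10.4906, 10.5181,
            10.4954, 10.4928, 10.4564, 10.4873, 10.4991]

plt.plot(steps, val_loss, marker="o")
plt.xlabel("Step")
plt.ylabel("Validation loss")
plt.title("gemma-2-9b_metamath_ortho: validation loss")
plt.show()
```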

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.0
  • PyTorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1