metadata

license: other
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: google/gemma-7b
model-index:
  - name: gemma-7b-spanishbillionwords
    results: []

gemma-7b-spanishbillionwords

This model is a fine-tuned version of google/gemma-7b, trained on the Spanish Billion Words corpus with the goal of improving the base Gemma model's performance on Spanish. It achieves the following results on the evaluation set:

Loss: 12.1686
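
Because the metadata lists peft as the library and google/gemma-7b as the base model, the weights are presumably a PEFT adapter that is applied on top of the base checkpoint. The following is a minimal loading sketch under that assumption, not a confirmed usage recipe. The adapter repository id (ADAPTER_ID) is hypothetical, since the card does not state where the adapter is hosted, and the sketch assumes you already have access to the base model.

```python
BASE_MODEL_ID = "google/gemma-7b"
# Hypothetical repo id: the card does not state where the adapter is hosted.
ADAPTER_ID = "your-username/gemma-7b-spanishbillionwords"


def load_model(adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the PEFT adapter weights to it."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 40) -> str:
    """Generate a continuation of `prompt` with the adapted model."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate("La capital de España es") would download the base model on first use, so it is best tried on a machine with enough memory for a 7B-parameter model.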

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0002
train_batch_size: 1
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 4
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 2
training_steps: 30
mixed_precision_training: Native AMP
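
Two of the values above are derived rather than set directly: total_train_batch_size follows from the per-device batch size and gradient accumulation, and the learning rate at each step follows from the linear scheduler with warmup. The sketch below reproduces both in plain Python. It assumes a single training device (NUM_DEVICES is not stated in the card, but a value of 1 is consistent with the reported total of 4) and the usual linear warmup followed by linear decay to zero.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 2e-4
TRAIN_BATCH_SIZE = 1   # per-device batch size
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 2
TRAINING_STEPS = 30
NUM_DEVICES = 1        # assumption: not stated in the card


def total_train_batch_size() -> int:
    """Number of examples consumed per optimizer update."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates (linear warmup, then linear decay)."""
    if step < WARMUP_STEPS:
        scale = step / max(1, WARMUP_STEPS)
    else:
        remaining = TRAINING_STEPS - step
        scale = max(0.0, remaining / max(1, TRAINING_STEPS - WARMUP_STEPS))
    return LEARNING_RATE * scale


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size())
    for step in (0, 1, 2, 16, 30):
        print(step, linear_lr(step))
```

With only 2 warmup steps out of 30, the learning rate peaks at 0.0002 almost immediately and spends the rest of the run decaying toward zero.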

Training results

Training Loss	Step	Validation Loss
3.7924	1	12.2361
4.4499	2	12.1524
3.9217	3	12.0710
4.3292	4	12.1710
6.6619	5	12.1710
4.4917	6	12.2628
4.8346	7	12.3997
3.6987	8	12.4212
6.0457	9	12.4049
3.7882	10	12.4228
3.9878	11	12.4168
5.1707	12	12.3961
3.7024	13	12.3430
5.8496	14	12.3009
5.1708	15	12.2863
4.9796	16	12.2789
4.3754	17	12.2600
4.8339	18	12.2371
4.0352	19	12.2284
3.9643	20	12.2266
3.6923	21	12.2103
4.8213	22	12.2015
3.8048	23	12.1901
4.3145	24	12.1837
3.6633	25	12.1811
4.2401	26	12.1775
3.3954	27	12.1757
5.6310	28	12.1720
3.8886	29	12.1714
4.3891	30	12.1686
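
The reported evaluation loss is the value at the final step, which is not necessarily the lowest value in the table. As a quick check, the sketch below copies the validation losses from the table above and finds the step with the minimum; the helper name best_step is illustrative only.

```python
# Validation losses copied from the table above, for steps 1 through 30.
VALIDATION_LOSS = [
    12.2361, 12.1524, 12.0710, 12.1710, 12.1710, 12.2628,
    12.3997, 12.4212, 12.4049, 12.4228, 12.4168, 12.3961,
    12.3430, 12.3009, 12.2863, 12.2789, 12.2600, 12.2371,
    12.2284, 12.2266, 12.2103, 12.2015, 12.1901, 12.1837,
    12.1811, 12.1775, 12.1757, 12.1720, 12.1714, 12.1686,
]


def best_step(losses):
    """Return (step, loss) for the lowest validation loss; steps are 1-indexed."""
    index = min(range(len(losses)), key=losses.__getitem__)
    return index + 1, losses[index]


if __name__ == "__main__":
    step, loss = best_step(VALIDATION_LOSS)
    print(f"lowest validation loss {loss:.4f} at step {step}")
    print(f"final validation loss {VALIDATION_LOSS[-1]:.4f} at step {len(VALIDATION_LOSS)}")
```

On these numbers the minimum occurs at step 3, and the validation loss stays within a narrow band around 12.1 to 12.4 for the whole run.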

Framework versions

PEFT 0.8.2
Transformers 4.38.0
Pytorch 2.2.1+cu121
Datasets 2.17.0
Tokenizers 0.15.2
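
When trying to reproduce the run, it can help to compare the locally installed packages against the versions listed above. The following is a small sketch using only the standard library; the package names are the usual PyPI distribution names, and compare_versions is a hypothetical helper rather than part of any of these libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by PyPI distribution name.
EXPECTED = {
    "peft": "0.8.2",
    "transformers": "4.38.0",
    "torch": "2.2.1+cu121",
    "datasets": "2.17.0",
    "tokenizers": "0.15.2",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (expected version, installed version or None)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed)
    return report


if __name__ == "__main__":
    for package, (wanted, installed) in compare_versions().items():
        status = "ok" if installed == wanted else "differs"
        print(f"{package}: expected {wanted}, installed {installed} ({status})")
```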