base_model: unsloth/gemma-2-9b-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- gemma2
- trl
Uploaded model
- Developed by: helixx999
- License: apache-2.0
- Finetuned from model: unsloth/gemma-2-9b-bnb-4bit
This is Gemma 2 (9B, 4-bit) fine-tuned on the SemEval-2014 restaurant dataset by Harsh Jain.
Training used `SFTTrainer` with the following arguments:

    from transformers import TrainingArguments
    from trl import SFTTrainer
    from unsloth import is_bfloat16_supported

    # model, tokenizer, dataset and max_seq_length are defined earlier (not shown here).
    trainer = SFTTrainer(
        model = model,
        tokenizer = tokenizer,
        train_dataset = dataset,
        dataset_text_field = "text_new",
        max_seq_length = max_seq_length,
        dataset_num_proc = 2,
        packing = False, # Can make training 5x faster for short sequences.
        args = TrainingArguments(
            per_device_train_batch_size = 2,
            gradient_accumulation_steps = 4,
            warmup_steps = 6, # Previously 5
            # num_train_epochs = 1, # Set this for 1 full training run.
            max_steps = 60,
            # learning_rate = 2e-4,
            learning_rate = 1e-4,
            fp16 = not is_bfloat16_supported(), # fall back to fp16 when bf16 is unavailable
            bf16 = is_bfloat16_supported(),
            logging_steps = 1,
            optim = "adamw_8bit",
            weight_decay = 0.01,
            lr_scheduler_type = "linear",
            seed = 3407,
            output_dir = "./tensorLog",
            report_to = "wandb",
        ),
    )
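
To make the arguments above easier to read, here is a small sketch of what they imply for the effective batch size and the learning-rate schedule. It is plain Python, not part of the training script. The constant `NUM_DEVICES = 1` is an assumption (the card does not state the hardware), and `linear_lr` is a hypothetical helper that mirrors the usual linear warmup followed by linear decay selected by `lr_scheduler_type = "linear"`; the actual schedule in a given library version may differ in detail.

```python
# Values copied from the TrainingArguments in this card.
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 6
MAX_STEPS = 60
PEAK_LR = 1e-4
NUM_DEVICES = 1  # assumption: single GPU; not stated in the card


def effective_batch_size(per_device=PER_DEVICE_BATCH,
                         accum=GRAD_ACCUM_STEPS,
                         devices=NUM_DEVICES):
    """Number of examples contributing to each optimizer step."""
    return per_device * accum * devices


def linear_lr(step, peak_lr=PEAK_LR, warmup=WARMUP_STEPS, total=MAX_STEPS):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    return peak_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    batch = effective_batch_size()
    print("effective batch size:", batch)
    print("examples processed over max_steps:", batch * MAX_STEPS)
    for step in (0, 3, 6, 33, 60):
        print("step", step, "lr", linear_lr(step))
```

Under these assumptions each optimizer step sees 2 × 4 = 8 examples, so the 60-step run processes 480 examples in total, which may be less than one full pass over the dataset. The learning rate rises to 1e-4 by step 6 and then decays linearly to zero at step 60.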