---
base_model: unsloth/gemma-2-9b-bnb-4bit
language:
  - en
license: apache-2.0
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - gemma2
  - trl
---

# Uploaded model

- **Developed by:** helixx999
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gemma-2-9b-bnb-4bit

This is Gemma 2 fine-tuned by Harsh Jain on the SemEval-2014 restaurant review data.
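The trainer below reads each training example from a single `text_new` string field. The exact prompt template used for this model is not documented here, so the helper and template below are only an illustrative guess at how a SemEval-2014 restaurant example might be flattened into that field:

```python
# Hypothetical helper: flattens one SemEval-2014 restaurant example into the
# single "text_new" string consumed by SFTTrainer. The template shown is an
# assumption, not the one actually used for this model.
def format_example(sentence: str, aspect: str, polarity: str) -> str:
    return (
        "### Instruction:\n"
        "Identify the sentiment of the given aspect in the review.\n"
        f"### Review:\n{sentence}\n"
        f"### Aspect:\n{aspect}\n"
        f"### Sentiment:\n{polarity}"
    )

text_new = format_example(
    "The food was great but the service was slow.", "service", "negative"
)
print(text_new)
```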

```python
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text_new",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False,  # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 6,  # Previously 5
        # num_train_epochs = 1,  # Set this for 1 full training run.
        max_steps = 60,
        # learning_rate = 2e-4,
        learning_rate = 1e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "./tensorLog",
        report_to = "wandb",
    ),
)
```
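A quick way to read the batch settings above: with gradient accumulation, the optimizer steps on an effective batch of `per_device_train_batch_size * gradient_accumulation_steps` examples (assuming a single GPU), so the 60-step run touches only a few hundred examples:

```python
# Derived from the TrainingArguments above (single-GPU assumption).
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
max_steps = 60

# Examples contributing to each optimizer update.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
# Total training examples processed over the whole run.
examples_seen = effective_batch_size * max_steps

print(effective_batch_size)  # 8
print(examples_seen)         # 480
```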