metadata
base_model: unsloth/gemma-2-9b-bnb-4bit
language:
  - en
license: apache-2.0
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - gemma2
  - trl

Uploaded model

  • Developed by: helixx999 (Harsh Jain)
  • License: apache-2.0
  • Finetuned from model: unsloth/gemma-2-9b-bnb-4bit

This is Gemma 2 (9B, 4-bit) fine-tuned on the SemEval-2014 restaurant dataset using the Unsloth framework.

Training Parameters:

    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 4,
    warmup_steps = 6,  # previously 5
    max_steps = 60,
    # learning_rate = 2e-4,  # earlier value, no longer used
    learning_rate = 1e-4,
    fp16 = not is_bfloat16_supported(),
    bf16 = is_bfloat16_supported(),
    logging_steps = 1,
    optim = "adamw_8bit",
    weight_decay = 0.01,
    lr_scheduler_type = "linear",
    seed = 3407,
    output_dir = "./tensorLog",
    report_to = "wandb",
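
To make the settings above easier to read, here is a minimal sketch of what they imply for the effective batch size and the learning-rate curve. It is plain Python and does not reproduce the actual training script. The single-device default and the helper names `effective_batch_size` and `linear_lr` are assumptions for illustration. The schedule formula is the usual linear warmup followed by linear decay that `lr_scheduler_type = "linear"` normally selects in `transformers`.

```python
# Values copied from the training parameters above.
PER_DEVICE_TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_STEPS = 6
MAX_STEPS = 60
LEARNING_RATE = 1e-4


def effective_batch_size(num_devices: int = 1) -> int:
    """Examples consumed per optimizer step.

    num_devices=1 is an assumption; the model card does not state the GPU count.
    """
    return PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Linear warmup from 0 to LEARNING_RATE over WARMUP_STEPS,
    then linear decay to 0 at MAX_STEPS.
    """
    if step < WARMUP_STEPS:
        factor = step / max(1, WARMUP_STEPS)
    else:
        factor = max(0.0, (MAX_STEPS - step) / max(1, MAX_STEPS - WARMUP_STEPS))
    return LEARNING_RATE * factor


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    print("examples seen in total:", effective_batch_size() * MAX_STEPS)
    for step in (0, 3, 6, 33, 60):
        print(f"step {step:2d}: lr = {linear_lr(step):.2e}")
```

Under the single-device assumption, each optimizer step covers 8 examples, so the 60-step run processes 480 examples in total. The learning rate peaks at 1e-4 at step 6 and decays linearly to 0 at step 60.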