Uploaded model

Developed by: alibidaran
License: apache-2.0
Finetuned from model: unsloth/llama-3-8b-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Direct usage

from unsloth import FastLanguageModel
import torch

max_seq_length = 2048
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "alibidaran/LLAMA3_Mental_Health_Consulting",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
prompt = """
I have many issues to address. My friends leave me alone because my hobbies are radically different from theirs.
I want to stay in touch with them, but they ignore me and don't even invite me to their parties. What do you recommend? """
# Wrap the prompt in [INST] ... [/INST] tags.
instructions = f"<s>[INST] {prompt} [/INST]"
inputs = tokenizer([instructions], return_tensors="pt").to("cuda")

# Sample a response without tracking gradients.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=500,
        do_sample=True,
        top_p=0.95,
        top_k=10,
        temperature=0.5,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
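
The decoded output above normally contains the prompt as well as the generated reply. A minimal sketch of two plain-Python helpers that build the instruction string and keep only the reply is shown below. The names build_instruction and extract_response are hypothetical and are not part of Unsloth or Transformers, and the sketch assumes the decoded text still contains the literal [/INST] marker.

```python
INST_OPEN = "[INST]"
INST_CLOSE = "[/INST]"


def build_instruction(prompt: str) -> str:
    """Wrap a user prompt in the same [INST] ... [/INST] tags as the usage snippet."""
    return f"<s>{INST_OPEN} {prompt} {INST_CLOSE}"


def extract_response(decoded: str) -> str:
    """Return the text after the first [/INST] marker.

    If the marker is missing (an assumption that may not hold for every
    tokenizer setting), the whole decoded string is returned, stripped.
    """
    _, sep, tail = decoded.partition(INST_CLOSE)
    return tail.strip() if sep else decoded.strip()
```

With the snippet above, the final print call could instead pass tokenizer.decode(outputs[0], skip_special_tokens=True) through extract_response so that only the model's reply is shown.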
