Model Name: Llama 3 orca_mini_v6_8b-AWQ

Llama 3 orca_mini_v6_8b-AWQ is a quantized version of pankajmathur/orca_mini_v6_8b-AWQ

"Obsessed with GenAI's potential? So am I ! Let's create together 🚀 https://www.linkedin.com/in/pankajam"

Example Usage

Here is the ChatML prompt format:

<|im_start|>system
You are Orca Mini, a helpful AI assistant.<|im_end|>
<|im_start|>user
Hello Orca Mini, what can you do for me?<|im_end|>
<|im_start|>assistant
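
To make the layout above concrete, here is a minimal sketch that renders a list of role/content messages into the same ChatML text. The helper name build_chatml_prompt is hypothetical and not part of the model or of transformers, and the exact whitespace is an assumption based on the sample above. In practice the tokenizer's own chat template is the authoritative source for the format.

```python
from typing import Dict, List


def build_chatml_prompt(messages: List[Dict[str, str]],
                        add_generation_prompt: bool = True) -> str:
    """Render role/content messages in the ChatML layout (hypothetical helper)."""
    parts = []
    for message in messages:
        # Each turn: <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are Orca Mini, a helpful AI assistant."},
    {"role": "user", "content": "Hello Orca Mini, what can you do for me?"},
]

prompt = build_chatml_prompt(messages)
print(prompt)
```

The trailing open assistant turn is what cues the model to write its reply rather than continue the user's message.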

The code example below shows how to use this model:

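from transformers import AutoModelForCausalLM, AutoTokenizer

model_slug = "pankajmathur/orca_mini_v6_8b-AWQ"
# AutoModelForCausalLM (not AutoModel) provides the language-model head that generate() needs.
# Loading AWQ checkpoints typically also requires the autoawq package and a GPU.
model = AutoModelForCausalLM.from_pretrained(model_slug)
tokenizer = AutoTokenizer.from_pretrained(model_slug)

messages = [
    {"role": "system", "content": "You are Orca Mini, a helpful AI assistant."},
    {"role": "user", "content": "Hello Orca Mini, what can you do for me?"}
]

# add_generation_prompt opens the assistant turn; return_dict makes the result unpackable with **
gen_input = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)
# max_new_tokens=256 is an illustrative limit; adjust as needed
output = model.generate(**gen_input, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))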
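
For decoder-only models, generate returns the prompt tokens followed by the newly generated ones, so the decoded text contains the whole conversation. As a rough sketch, assuming the output is decoded with the ChatML special tokens kept, the assistant's reply can be pulled out with plain string handling. The helper name extract_assistant_reply and the sample string are hypothetical and only for illustration.

```python
def extract_assistant_reply(decoded: str) -> str:
    """Return the text of the last assistant turn in a ChatML string (hypothetical helper)."""
    marker = "<|im_start|>assistant\n"
    start = decoded.rfind(marker)
    if start == -1:
        # No assistant turn found; fall back to the whole text.
        return decoded.strip()
    reply = decoded[start + len(marker):]
    end = reply.find("<|im_end|>")
    if end != -1:
        # Drop the end-of-turn token and anything after it.
        reply = reply[:end]
    return reply.strip()


# Hand-written sample standing in for real model output.
sample = (
    "<|im_start|>user\nHello<|im_end|>\n"
    "<|im_start|>assistant\nHi there!<|im_end|>"
)
print(extract_assistant_reply(sample))
```

An alternative that avoids string parsing is to slice the output tensor at the prompt length before decoding.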

This model is governed by the META LLAMA 3 COMMUNITY LICENSE AGREEMENT.

Safetensors model size: 1.98B params
Tensor types: I32, FP16