Inference:

!pip install -q "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install -q --no-deps "xformers<0.0.26" trl peft accelerate bitsandbytes
from unsloth import FastLanguageModel
import torch
max_seq_length = 512
dtype = None # None for auto-detection. Use float16 for Tesla T4/V100, bfloat16 for Ampere+.
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Hinglish-Project/llama-3-8b-English-to-Hinglish",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
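
The Unsloth notebooks typically also switch the loaded model into the library's fast inference mode before generating. This optional one-liner uses Unsloth's real FastLanguageModel.for_inference helper; whether the original card relied on it is not stated, so treat it as an optional extra:

# Optional: enable Unsloth's optimized inference path (native 2x faster generation).
FastLanguageModel.for_inference(model)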
def pipe(prompt):
  alpaca_prompt = """### Instruction: Translate given text to Hinglish Text:

### Input:
{}

### Response:
"""

  inputs = tokenizer(
      [
          alpaca_prompt.format(prompt),
      ], return_tensors = "pt").to("cuda")

  outputs = model.generate(**inputs, max_new_tokens = 2048, use_cache = True)
  raw_text = tokenizer.batch_decode(outputs)[0]
  # Keep only the generated answer: drop everything up to "### Response:"
  # and the trailing <|end_of_text|> token.
  return raw_text.split("### Response:\n")[1].split("<|end_of_text|>")[0]
text = "This is a fine-tuned Hinglish translation model using Llama 3."
pipe(text)
## yeh ek fine-tuned Hinglish translation model hai jisme Llama 3 ka use kiya gaya hai.
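
The same tokenizer and generate calls also accept a list of prompts, so several sentences can be translated in one pass. Below is a minimal sketch of a batched variant, assuming left-padding with the EOS token is acceptable for this checkpoint; pipe_batch and its defaults are illustrative, not part of the original card:

def pipe_batch(prompts):
  # Hypothetical batched variant of pipe() above.
  alpaca_prompt = """### Instruction: Translate given text to Hinglish Text:

### Input:
{}

### Response:
"""
  # Llama tokenizers ship without a pad token; reuse EOS, and left-pad so
  # generation starts right after each prompt (standard for decoder-only models).
  tokenizer.pad_token = tokenizer.eos_token
  tokenizer.padding_side = "left"
  inputs = tokenizer(
      [alpaca_prompt.format(p) for p in prompts],
      return_tensors = "pt", padding = True).to("cuda")
  outputs = model.generate(**inputs, max_new_tokens = 2048, use_cache = True)
  return [t.split("### Response:\n")[1].split("<|end_of_text|>")[0]
          for t in tokenizer.batch_decode(outputs)]

pipe_batch(["Good morning!", "How are you doing today?"])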

Uploaded model

  • Developed by: Hinglish-Project
  • License: apache-2.0
  • Finetuned from model: unsloth/llama-3-8b-bnb-4bit

This Llama 3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
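
For context, the usual Unsloth + TRL recipe behind a card like this attaches LoRA adapters with Unsloth's helper and runs TRL's SFTTrainer over prompt-formatted text. The sketch below reuses model, tokenizer, and max_seq_length from the inference snippet; dataset, the LoRA rank, and all training arguments are illustrative assumptions, not the card's actual configuration:

from trl import SFTTrainer
from transformers import TrainingArguments

# Attach LoRA adapters (rank/alpha/target modules are assumptions).
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,          # assumed: a dataset with a "text" column
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        learning_rate = 2e-4,
        max_steps = 60,               # illustrative
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        output_dir = "outputs",
    ),
)
trainer.train()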
