---
language:
  - en
  - it
license: apache-2.0
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - llama
  - trl
  - sft
---

# Meta LLaMA 3.1 8B 4-bit Finetuned Model

This model is a fine-tuned version of Meta-Llama-3.1-8B, developed by ruslanmv for text-generation tasks such as translating natural-language requests into SQL. It uses 4-bit quantization, which makes inference more memory-efficient while preserving strong natural-language generation quality.


## Model Details

- **Base Model:** unsloth/meta-llama-3.1-8b-bnb-4bit
- **Finetuned by:** ruslanmv
- **Languages:** English, Italian
- **License:** Apache 2.0
- **Tags:**
  - text-generation-inference
  - transformers
  - unsloth
  - llama
  - trl
  - sft

## Model Usage

### Installation

To use this model, you will need to install the necessary libraries:

```bash
pip install transformers accelerate bitsandbytes
```

### Loading the Model in Python

Here’s an example of how to load this fine-tuned model using Hugging Face's transformers library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the model and tokenizer from the Hugging Face Hub
# (the repo id is assumed to live under the author's namespace)
model_name = "ruslanmv/Meta-Llama-3.1-8B-Text-to-SQL-4bit"

# device_map="auto" places the model on GPU when available, otherwise CPU
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example usage (Italian: "Retrieve the count of all rows in table table1")
input_text = "Recupera il conteggio di tutte le righe nella tabella table1"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generate output text (max_new_tokens bounds only the generated tokens,
# unlike max_length, which also counts the prompt)
outputs = model.generate(**inputs, max_new_tokens=50)

# Decode and print the generated text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```
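The same steps can also be wrapped in the high-level `pipeline` API; a minimal sketch, assuming the same repo id as above:

```python
from transformers import pipeline

# Build a text-generation pipeline around the (assumed) repo id used above
generator = pipeline(
    "text-generation",
    model="ruslanmv/Meta-Llama-3.1-8B-Text-to-SQL-4bit",
    device_map="auto",
)

result = generator(
    "Recupera il conteggio di tutte le righe nella tabella table1",
    max_new_tokens=50,
)
print(result[0]["generated_text"])
```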

## Model Features

- **Text Generation:** fine-tuned to generate coherent, contextually accurate text from the provided input.
- **Efficiency:** 4-bit quantization via the bitsandbytes library reduces memory use and speeds up inference (see the sketch after this list).
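For reference, this is roughly what explicit 4-bit loading with bitsandbytes looks like; a minimal sketch using the base checkpoint unsloth/meta-llama-3.1-8b-bnb-4bit (pre-quantized bnb-4bit repos such as this one already load in 4-bit without this config):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Explicit 4-bit quantization config (NF4 + double quantization),
# shown for illustration only; pre-quantized checkpoints don't need it
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/meta-llama-3.1-8b-bnb-4bit",
    quantization_config=bnb_config,
    device_map="auto",
)
```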

## License

This model is licensed under the Apache 2.0 License. You are free to use, modify, and distribute this model, provided that you comply with the license terms.

## Acknowledgments

This model was fine-tuned by ruslanmv, building on the unsloth project and its meta-llama-3.1-8b-bnb-4bit quantized version of Meta-Llama-3.1-8B.