Meta-LLama3-Instruct-Arabic

Meta-LLama3-Instruct-Arabic is a fine-tuned version of Meta's Llama 3 model, specialized for Arabic language tasks. It is designed for Arabic NLP tasks, including text generation and language comprehension.

Model Details

  • Model Name: Meta-LLama3-Instruct-Arabic
  • Base Model: Llama 3
  • Languages: Arabic
  • Tasks: Text Generation, Language Understanding
  • Quantization: 4-bit via bitsandbytes (the usage example loads the model with load_in_4bit=True)

Installation

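To use this model, you need the transformers library from Hugging Face, along with bitsandbytes and accelerate for 4-bit loading. The usage example does not import unsloth, so it is not installed here. You can install the dependencies from a notebook cell as follows:

!pip install transformers bitsandbytes accelerate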
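
Before loading the model, it can help to confirm that the required packages are importable. The following is a minimal sketch using only the standard library; the helper name missing_packages and the package list are illustrative assumptions, not part of the original instructions.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list mirroring the install command above
required = ["transformers", "bitsandbytes"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

If anything is reported as missing, rerun the install command before continuing.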

How to Use

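from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from IPython.display import Markdown
import textwrap

model_id = "MahmoudIbrahim/Meta-LLama3-Instruct-Arabic"

# Load tokenizer and model; 4-bit loading needs bitsandbytes and typically a CUDA GPU
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",  # place the weights on the available device
)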


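# Alpaca-style prompt in Arabic. English gloss: "Below is an instruction that describes a task,
# paired with an input that provides further context. Write a response that appropriately
# completes the request." The two section headers mean "Instruction" and "Response".
alpaca_prompt = """فيما يلي تعليمات تصف مهمة، إلى جانب مدخل يوفر سياقاً إضافياً. اكتب استجابة تُكمل الطلب بشكل مناسب.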

### التعليمات:
{}

### الاستجابة:
{}"""

# Format the prompt with instruction and an empty output placeholder
formatted_prompt = alpaca_prompt.format(
    "ماذا تعرف عن الحضاره المصريه",  # instruction: "What do you know about Egyptian civilization?"
    "",  # leave the response blank so the model generates it
)

# Tokenize the formatted string and move the input IDs to the model's device
input_ids = tokenizer.encode(formatted_prompt, return_tensors="pt").to(model.device)

def to_markdown(text):
    text = text.replace('•','*')
    return Markdown(textwrap.indent(text, '>', predicate=lambda _: True))

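# Generate text
output = model.generate(
    input_ids,
    max_new_tokens=128,         # Maximum number of newly generated tokens; adjust as needed
    do_sample=True,             # Enable sampling so top_k, top_p, and temperature take effect
    num_return_sequences=1,     # Number of generated responses
    no_repeat_ngram_size=2,     # Block any repeated 2-gram
    top_k=50,                   # Sample only from the 50 most likely tokens
    top_p=0.9,                  # Nucleus sampling with cumulative probability 0.9
    temperature=0.7,            # Lower values are more focused, higher values more varied
)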

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
to_markdown(generated_text)
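
Because the decoded output contains the prompt as well as the generated continuation, it may be useful to keep only the text after the response marker. The sketch below is plain Python and makes no model calls; build_prompt and extract_response are hypothetical helper names, and the template is copied from the usage example.

```python
# Same Alpaca-style template as in the usage example
ALPACA_PROMPT = """فيما يلي تعليمات تصف مهمة، إلى جانب مدخل يوفر سياقاً إضافياً. اكتب استجابة تُكمل الطلب بشكل مناسب.

### التعليمات:
{}

### الاستجابة:
{}"""

RESPONSE_MARKER = "### الاستجابة:"


def build_prompt(instruction: str) -> str:
    """Fill the template with an instruction and leave the response empty."""
    return ALPACA_PROMPT.format(instruction, "")


def extract_response(generated_text: str) -> str:
    """Return the text after the first response marker, or the whole text if it is absent."""
    _, marker, tail = generated_text.partition(RESPONSE_MARKER)
    return tail.strip() if marker else generated_text.strip()


# Hypothetical decoded output: the prompt followed by a placeholder continuation
decoded = build_prompt("ماذا تعرف عن الحضاره المصريه") + "نص الاستجابة هنا"
print(extract_response(decoded))
```

In the usage example, extract_response(generated_text) could be passed to to_markdown in place of the full decoded string.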

Model size: 4.65B params (Safetensors). Tensor types: FP16, F32, U8.