---
license: llama3.1
pipeline_tag: text-generation
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
datasets:
- Kushtrim/alpaca-cleaned-sq
language:
- sq
---

# Kushtrim/Llama-3.1-8B-Instruct-bnb-4bit-shqip

## Model Overview

**Kushtrim/Llama-3.1-8B-Instruct-bnb-4bit-shqip** is a fine-tuned version of the [Llama 3.1 model](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), optimized for Albanian language tasks. It targets a range of natural language processing tasks in Albanian and uses 4-bit quantization (bnb) to keep inference efficient.

## Model Details

- **Model Name:** Kushtrim/Llama-3.1-8B-Instruct-bnb-4bit-shqip
- **Base Model:** Llama 3.1
- **Model Size:** 8 billion parameters
- **Quantization:** 4-bit precision (bnb)
- **Language:** Albanian
- **License:** [llama3.1](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct/resolve/main/LICENSE)

## Limitations

- **Representation of Harms & Stereotypes:** Outputs may be biased, reflecting real-world societal biases.
- **Inappropriate or Offensive Content:** The model may generate content that is offensive or inappropriate in certain contexts.
- **Information Reliability:** The model may produce inaccurate or outdated information.
- **Dataset Size:** The Albanian dataset used for fine-tuning was not very large, which may limit the model's performance and coverage.

## Intended Use

- **Intended Use Cases:** This model is suitable for various NLP tasks in Albanian, including conversational AI, text generation, and language understanding.
- **Out-of-scope Use:** This model should not be used in ways that violate laws, regulations, or ethical guidelines. It is also not intended for use in languages other than Albanian unless appropriately fine-tuned.

## Responsible AI Considerations

Developers using this model should:

- Evaluate and mitigate risks related to accuracy, safety, and fairness.
- Ensure compliance with applicable laws and regulations.
- Implement additional safeguards for high-risk scenarios and sensitive contexts.
- Inform end-users that they are interacting with an AI system.
- Use feedback mechanisms and contextual information grounding techniques (RAG) to enhance output reliability.

## Usage

```python
# Install dependencies first (in a notebook cell):
#   !pip3 install -U transformers peft accelerate bitsandbytes

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

hf_token = "hf_...."

# Fix the seed so sampling is repeatable across runs.
torch.random.manual_seed(0)

model_id = "Kushtrim/Llama-3.1-8B-Instruct-bnb-4bit-shqip"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
    token=hf_token,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, token=hf_token)

messages = [
    {"role": "system", "content": "Je një asistent inteligjent shumë i dobishëm."},
    {"role": "user", "content": "Identifiko emrat e personave në këtë artikull 'Majlinda Kelmendi (lindi më 9 maj 1991), është një xhudiste shqiptare nga Peja, Kosovë.'"},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 2048,
    "return_full_text": False,  # return only the newly generated text
    "temperature": 0.9,
    "do_sample": True,
}

# add_generation_prompt=True appends the assistant header so the model
# starts its reply instead of continuing the user turn.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

output = pipe(prompt, **generation_args)
print(output[0]["generated_text"])
```

## Acknowledgements

This model builds on Meta-Llama-3.1-8B-Instruct, further fine-tuned for Albanian language tasks. Special thanks to the developers and researchers who contributed to the original Llama 3.1.
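
## Grounding Example

The Responsible AI Considerations recommend grounding outputs in contextual information (RAG). The block below is a minimal sketch of one way to do that at the prompt level, assuming the passages have already been retrieved by some external search step. The helper name `build_grounded_messages`, the Albanian instruction wording, and the `Konteksti`/`Pyetja` layout are illustrative assumptions, not part of this model's training format.

```python
from typing import Dict, List


def build_grounded_messages(question: str, passages: List[str]) -> List[Dict[str, str]]:
    """Build chat messages that ask the model to answer from the given passages.

    Hypothetical helper: the prompt wording below is an assumption, not a
    format the model was specifically trained on.
    """
    # Number the passages so an answer can refer back to its source.
    context = "\n".join(f"[{i}] {p.strip()}" for i, p in enumerate(passages, start=1))
    system = (
        "Je një asistent inteligjent shumë i dobishëm. "
        "Përgjigju vetëm duke u bazuar në kontekstin e dhënë."
    )
    user = f"Konteksti:\n{context}\n\nPyetja: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]


# Example: ground a question in a single retrieved passage.
messages = build_grounded_messages(
    "Nga cili qytet është Majlinda Kelmendi?",
    ["Majlinda Kelmendi (lindi më 9 maj 1991), është një xhudiste shqiptare nga Peja, Kosovë."],
)
print(messages[1]["content"])
```

The returned list has the same shape as a system-plus-user `messages` list, so it can be passed to `tokenizer.apply_chat_template` in the usual way. How well the model follows the grounding instruction should be checked on your own data.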