---
license: apache-2.0
datasets:
- ZharfaTech/ZharfaTech-OpenAssistant-Guanaco-Persian-Farsi
- ZharfaTech/ZharfaTech-Open-Platypus-Persian-Farsi
language:
- fa
- en
pipeline_tag: text-generation
---
# ZharfaOpen: A State-of-the-Art Persian Language LLM
![image/webp](https://cdn-uploads.huggingface.co/production/uploads/6466b4df060756d2854a395c/5gkEu63QY8Gx1OQbgag6p.webp)
## Model Description
ZharfaOpen is an open-source large language model fine-tuned from the Mistral-7B architecture. Tailored specifically for the Persian language, ZharfaOpen excels in understanding and generating responses for conversations and instruction-following tasks. As the current state-of-the-art (SOTA) in Persian open-source LLMs, ZharfaOpen represents an advancement in natural language processing capabilities for Persian language speakers and developers.
## Example Outputs
Below are example outputs demonstrating ZharfaOpen's capabilities in conversation and instruction following:
**Example 1: Conversation**
- Input: "سلام، خوبی؟" (English: "Hi, how are you?")
- Output: "سلام، خوشحالم که با شما صحبت می‌کنم. چطور می‌توانم به شما کمک کنم؟" (English: "Hello, I'm glad to be talking with you. How can I help you?")
**Example 2: Instruction Following**
- Input: "مشتق تابع x*sin(x) را به کمک قاعده ضرب محاسبه کن." (English: "Compute the derivative of x*sin(x) using the product rule.")
- Output: "برای محاسبه مشتق تابع x*sin(x) از قاعده ضرب استفاده می‌کنیم: مشتق هر عامل را در عامل دیگر ضرب کرده و حاصل‌ها را با هم جمع می‌کنیم.
1. مشتق تابع x:
مشتق تابع x برابر با ۱ است.
2. مشتق تابع sin(x):
مشتق تابع sin(x) برابر با cos(x) است.
حال می‌توانیم مشتق تابع x*sin(x) را به دست بیاوریم:
(x * sin(x))' = x * sin'(x) + sin(x) * (x')
= x * cos(x) + sin(x) * 1
= x * cos(x) + sin(x)
بنابراین، مشتق تابع x*sin(x) برابر با x * cos(x) + sin(x) است."
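The product-rule result above, x·cos(x) + sin(x), can be sanity-checked numerically. Here is a minimal sketch (not part of the ZharfaOpen tooling) that compares the symbolic derivative against a central finite-difference approximation:

```python
import math

def f(x):
    # the original function
    return x * math.sin(x)

def df_symbolic(x):
    # product rule: (x*sin(x))' = x*cos(x) + sin(x)
    return x * math.cos(x) + math.sin(x)

def df_numeric(x, h=1e-6):
    # central finite-difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

# the two derivatives should agree to high precision at sample points
for x in [0.5, 1.0, 2.0]:
    assert abs(df_symbolic(x) - df_numeric(x)) < 1e-6
```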
## Benchmark Results
To be included.
## Prompt Format
ZharfaOpen supports the ChatML format for prompting, enabling users to structure their prompts effectively for conversation and instruction-following tasks. Here's an example prompt format:
```xml
<|im_start|>system
تو یک مدل زبانی بزرگ هستی به اسم «ژرفا» که به سوالات کاربر به درستی و با دقت پاسخ می‌دهی
<|im_end|>
<|im_start|>user
لطفا توضیح دهید چگونه یک ایمیل حرفه‌ای بنویسیم.
<|im_end|>
<|im_start|>assistant
```
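The ChatML wrapper above can also be assembled programmatically. A minimal sketch follows; the `build_chatml_prompt` helper is illustrative, not part of the ZharfaOpen tooling:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Wrap a system message and a user message in ChatML markers,
    ending with an open assistant turn for the model to complete."""
    return (
        f"<|im_start|>system\n{system}\n<|im_end|>\n"
        f"<|im_start|>user\n{user}\n<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Hello!",
)
print(prompt)
```

The resulting string can be passed directly as the prompt in the inference code below.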
## Inference Code
Here is a simple Python example for using ZharfaOpen with Hugging Face Transformers:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ZharfaTech/ZharfaOpen"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap your Persian prompt in the ChatML format shown above.
inputs = tokenizer.encode("YOUR_PROMPT_HERE", return_tensors="pt")

# max_new_tokens bounds the generated continuation only, so a long
# prompt does not eat into the generation budget.
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
Replace `"YOUR_PROMPT_HERE"` with your input in Persian.
## Bias, Risks, and Limitations
While ZharfaOpen is a significant step forward in Persian LLM technology, users should be aware of potential biases and limitations inherent in any language model. ZharfaOpen has been trained on diverse datasets, but it may still reflect or amplify societal biases present in the training data. Users are encouraged to critically evaluate outputs and consider these factors in their applications.
## How to Cite
If you use ZharfaOpen in your research or application, please cite it as follows:
```bibtex
@misc{zharfaopen2024,
  title={ZharfaOpen: A State-of-the-Art Persian Language LLM},
  author={ZharfaTech Team},
  year={2024},
  note={Model fine-tuned over Mistral-7B for Persian language tasks}
}
```
ZharfaOpen is part of ZharfaTech's commitment to advancing language technologies for the Persian-speaking world. We encourage the community to engage with us in improving and expanding the capabilities of ZharfaOpen.
---
For more information or to contribute to the project, please visit our [Website](https://zahrfa.tech) or contact us directly.