---
language:
- en
- fa
---

<p align="center">
  <picture>
    <img alt="Hugging Face Transformers Library" src="https://i.postimg.cc/VN4F7WRC/Untitled-design-modified.png" width="1000" height="450" style="max-width: 100%;">
  </picture>
</p>

<h4 align="center">
    <p>
        <a href="https://huggingface.co/aidal/Persian-Mistral-7B#model-description">Model description</a> |
        <a href="https://huggingface.co/aidal/Persian-Mistral-7B#example-output">Example output</a> |
        <a href="https://huggingface.co/aidal/Persian-Mistral-7B#benchmark-results">Benchmark results</a> |
        <a href="https://huggingface.co/aidal/Persian-Mistral-7B#how-to-use">How to use</a> |
        <a href="https://huggingface.co/aidal/Persian-Mistral-7B#training-and-finetuning">Training and finetuning</a>
    </p>
</h4>

----

# Model description

Jamba is a state-of-the-art, hybrid SSM-Transformer LLM. It delivers throughput gains over traditional Transformer-based models, while outperforming or matching the leading models of its size class on most common benchmarks.

Jamba is the first production-scale Mamba implementation, which opens up interesting research and application opportunities. While this initial experimentation shows encouraging gains, we expect these to be further enhanced with future optimizations and explorations.

This model card is for the base version of Jamba. It's a pretrained, mixture-of-experts (MoE) generative text model, with 12B active parameters and a total of 52B parameters across all experts. It supports a 256K context length, and can fit up to 140K tokens on a single 80GB GPU.

----

# Example output:

**Example 1:**
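- Input: "سلام، خوبی؟" (English: "Hello, how are you?")
- Output: "سلام، خوشحالم که با شما صحبت می کنم. چطور می توانم به شما کمک کنم؟" (English: "Hello, I'm glad to be talking with you. How can I help you?")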

----
# Benchmark results

| model         | dataset           | max_token | prompt | score   |
|---------------|-------------------|-----------|--------|---------|
| base-model-7b | ARC-easy-dev      | 2         | en-1   | 0.41929 |
| base-model-7b | ARC-easy-dev      | 80        | en-2   | 0.39122 |
| fa-model-7b   | ARC-easy-dev      | 80        | en-1   | 0.37894 |
| base-model-7b | ARC-challenge-dev | 80        | en-2   | 0.37123 |
| fa-model-7b   | ARC-challenge-dev | 80        | en-1   | 0.39298 |

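The table does not state how the `score` column was computed. If it is plain accuracy over multiple-choice answers, as is common for ARC, the scoring step could look roughly like the sketch below. The function names `extract_choice` and `accuracy`, the label set, and the rule of taking the first standalone answer letter in the generated text are all assumptions made for illustration, not the authors' evaluation code.

```python
import re


def extract_choice(generated, labels=("A", "B", "C", "D")):
    """Return the first standalone answer label in the generated text, or None.

    Assumption: the model answers with a letter such as "B" or "The answer is C.".
    """
    pattern = r"\b(" + "|".join(re.escape(label) for label in labels) + r")\b"
    match = re.search(pattern, generated)
    return match.group(1) if match else None


def accuracy(generations, gold_labels):
    """Fraction of generations whose extracted label equals the gold label."""
    if len(generations) != len(gold_labels):
        raise ValueError("generations and gold_labels must have the same length")
    if not gold_labels:
        return 0.0
    correct = sum(
        extract_choice(text) == gold
        for text, gold in zip(generations, gold_labels)
    )
    return correct / len(gold_labels)


if __name__ == "__main__":
    # Hypothetical generations, not real model outputs.
    sample_generations = ["B", "The answer is C.", "D"]
    sample_gold = ["B", "C", "A"]
    print(accuracy(sample_generations, sample_gold))
```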
----
# How to use

```python
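from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("aidal/Persian-Mistral-7B")
model = AutoModelForCausalLM.from_pretrained("aidal/Persian-Mistral-7B")

# Persian prompt: "What is the capital of Iran?"
input_text = "پایتخت ایران کجاست؟"
inputs = tokenizer(input_text, return_tensors="pt")

# Cap the number of generated tokens; 64 is an arbitrary choice, adjust as needed.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))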
```
----
# Training and finetuning