+---
+license: gemma
+library_name: transformers
+pipeline_tag: text-generation
+extra_gated_button_content: Acknowledge license
+tags:
+- rag
+language:
+- ar
+- en
+model-index:
+- name: SILMA-Kashif-2B-Instruct-v1.0
+  results:
+  - task:
+      type: text-generation
+    dataset:
+      name: SILMA RAGQA Benchmark Dataset V1.0
+      type: silma-ai/silma-rag-qa-benchmark-v1.0
+    metrics:
+    - name: SILMA RAGQA Benchmark Score
+      type: Average of Exact Match, BLEU, ROUGE, and BERTScore.
+      value: 0.347
+    source:
+      name: SILMA RAGQA Benchmark
+      url: https://huggingface.co/datasets/silma-ai/silma-rag-qa-benchmark-v1.0
+---
+## SILMA Kashif Model
+* **SILMA Kashif 2B Instruct v1.0** is the initial release within the SILMA Kashif Family of models, specifically designed for **RAG** (Retrieval-Augmented Generation) tasks
+* Kashif excels at one specific task: answering questions based on provided context passages in both Arabic and English. The model can also perform entity extraction as a secondary skill
+* SILMA Kashif 2B v1.0 stands out as the top-performing open model within the 3-9 billion parameter range based on our evaluations using [SILMA RAGQA Benchmark](https://huggingface.co/datasets/silma-ai/silma-rag-qa-benchmark-v1.0)
+* SILMA Kashif is built on top of Google's Gemma foundation models, combining the strengths of Gemma with SILMA's own training
+* SILMA is an open-weight model, free to use in accordance with our open license
+* The model supports a context length of 12k tokens
+## Model Skill and Capabilities
+The model was trained to perform well on the following skills:
+- The ability to answer general questions in Arabic and English
+- The ability to deal with short and long contexts
+- The ability to provide short and long answers effectively
+- The ability to answer complex numerical questions
+- The ability to answer questions based on tabular data
+- Answering multi-hop questions: The ability to answer a single question using pieces of data from multiple paragraphs
+- Negative rejection: The ability to identify and exclude inaccurate answers, and provide a more accurate statement such as "The answer cannot be found in the given context"
+- Multi-domains: The ability to answer questions based on texts from different fields such as finance, medical, legal, etc.
+- The ability to deal with ambiguous contexts
+- The ability to extract entities from text
+- The ability to deal with diverse and complex prompts
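An application that relies on negative rejection may want to detect it in the model's output, for example to fall back to another retriever. The sketch below is a minimal illustration, not part of the model's API: `is_negative_rejection` and `REJECTION_MARKERS` are hypothetical names, and the only marker used is the English example sentence quoted in the list above. The model's actual wording may differ, particularly in Arabic, so the marker list would need to be adapted.

```python
# Hypothetical marker list: only the example sentence from the skills list.
# Real outputs may be phrased differently, so extend this as needed.
REJECTION_MARKERS = ("the answer cannot be found in the given context",)


def is_negative_rejection(answer: str) -> bool:
    """Return True if the answer looks like a 'not in context' refusal."""
    # Lowercase and collapse whitespace so minor formatting changes still match.
    normalized = " ".join(answer.lower().split())
    return any(marker in normalized for marker in REJECTION_MARKERS)


rejected = is_negative_rejection("The answer cannot be found in the given context.")
answered = is_negative_rejection("The company was founded in 1999.")
```

A plain substring match is deliberately simple; a production system would probably need a more robust check.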
+## SILMA AI
+[silma.ai](https://silma.ai) is a leading Generative AI startup dedicated to empowering Arabic speakers with state-of-the-art AI solutions.
+### Usage
+Below are some code snippets to help you get started quickly with running the model. First, install the Transformers library with:
+```sh
+pip install -U transformers
+```
+Then, copy the snippet from the section that is relevant to your use case.
+#### Running with the `pipeline` API
+```python
+import torch
+from transformers import pipeline
+pipe = pipeline(
+    "text-generation",
+    model="silma-ai/SILMA-Kashif-2B-Instruct-v1.0",
+    model_kwargs={"torch_dtype": torch.bfloat16},
+    device="cuda",  # replace with "mps" to run on a Mac device
+)
+messages = [
+    {"role": "user", "content": "اكتب رسالة تعتذر فيها لمديري في العمل عن الحضور اليوم لأسباب مرضية."},
+]
+outputs = pipe(messages, max_new_tokens=256)
+assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
+print(assistant_response)
+```
+- Response:
+```text
+السلام عليكم ورحمة الله وبركاته
+أودّ أن أعتذر عن عدم الحضور إلى العمل اليوم بسبب مرضي. أشعر بالسوء الشديد وأحتاج إلى الراحة. سأعود إلى العمل فور تعافيي.
+شكراً لتفهمكم.
+مع تحياتي،
+[اسمك]
+```
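Because the model is intended for answering questions over retrieved context, a RAG-style call would normally put the context passages and the question into the user message. The sketch below shows one possible way to build such a message list. The prompt wording, the helper name `build_rag_messages`, and the sample passages are illustrative assumptions, not an official prompt template for this model.

```python
from typing import Dict, List

# Assumed fallback sentence, borrowed from the negative-rejection example
# in the skills list; the model is not guaranteed to reply with it verbatim.
NO_ANSWER_HINT = "The answer cannot be found in the given context"


def build_rag_messages(question: str, passages: List[str]) -> List[Dict[str, str]]:
    """Pack retrieved passages and a question into a single user message."""
    if not passages:
        raise ValueError("at least one context passage is required")
    # Number the passages so answers that span several of them stay traceable.
    context = "\n\n".join(
        f"[{i}] {passage.strip()}" for i, passage in enumerate(passages, start=1)
    )
    prompt = (
        "Answer the question using only the context below. "
        f'If the context does not contain the answer, reply: "{NO_ANSWER_HINT}".\n\n'
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}"
    )
    return [{"role": "user", "content": prompt}]


# Made-up sample data, for illustration only.
messages = build_rag_messages(
    "In which year was the company founded?",
    ["Acme Corp was founded in 1999 in Cairo.", "It employs 250 people."],
)
```

The resulting `messages` list has the same shape as the one passed to `pipe(...)` in the snippet above, so it can be used in its place.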
+### GPU Requirements
+The following are the minimum/recommended GPU requirements for running inference:
+* Recommended
+  * At least one GPU with a minimum of 24 GB of GPU memory
+  * Examples: Nvidia RTX 4090
+* Minimum
+  * At least one GPU with 8-12 GB of GPU memory
+  * Examples: Nvidia RTX 3070 or RTX 4070
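As a rough sanity check on these figures, the memory taken by the weights alone can be estimated as parameter count times bytes per parameter. The sketch below assumes a nominal 2 billion parameters (taken from the "2B" in the model name; the exact count may differ) and the 2-byte `bfloat16` dtype used in the usage snippet. `estimate_weight_memory_gb` is a hypothetical helper, and the estimate ignores activations, the KV cache, and framework overhead, all of which add to the real footprint.

```python
def estimate_weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Nominal 2B parameters in bfloat16 (2 bytes each): about 4 GB for weights.
weights_gb = estimate_weight_memory_gb(2e9)
```

The gap between this weights-only figure and the requirements listed above leaves room for the context, intermediate activations, and runtime overhead.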
+### Citation
+```none
+@article{silma_01_2025,
+    title={Silma},
+    url={https://www.silma.ai},
+    publisher={Silma},
+    author={Silma Team},
+    year={2025}
+}
+```
+## Usage and Limitations
+This model has certain limitations that users should be aware of.
+### Intended Usage
+* The model should only be used in question-answering use cases such as RAG
+* The model can also be used to extract entities from text
+### Limitations
+* Due to its small number of parameters, we have found that the model is not very strong at numerical and financial reasoning (answering questions that require calculation)