Update README.md
README.md
CHANGED
@@ -8,23 +8,29 @@ language:
 <picture>
     <img alt="Hugging Face Transformers Library" src="https://i.postimg.cc/VN4F7WRC/Untitled-design-modified.png" width="1000" height="450" style="max-width: 100%;">
 </picture>
-<br/>
-<br/>
 </p>
 
 <h4 align="center">
     <p>
-        <b>English</b> |
         <a href="https://huggingface.co/aidal/Persian-Mistral-7B#model-description">Model description</a> |
         <a href="https://huggingface.co/aidal/Persian-Mistral-7B#example-output">Example output</a> |
         <a href="https://huggingface.co/aidal/Persian-Mistral-7B#banchmark-results">Banchmark results</a> |
         <a href="https://huggingface.co/aidal/Persian-Mistral-7B#how-to-use">How to use</a> |
-        <a href="https://huggingface.co/aidal/Persian-Mistral-7B#training-and-finetuning">Training and finetuning</a>
+        <a href="https://huggingface.co/aidal/Persian-Mistral-7B#training-and-finetuning">Training and finetuning</a>
     </p>
 </h4>
+
 ----
+
 # Model description
+
+>Jamba is a state-of-the-art, hybrid SSM-Transformer LLM. It delivers throughput gains over traditional Transformer-based models, while outperforming or matching the leading models of its size class on most common benchmarks.
+
+>Jamba is the first production-scale Mamba implementation, which opens up interesting research and application opportunities. While this initial experimentation shows encouraging gains, we expect these to be further enhanced with future optimizations and explorations.
+
+>This model card is for the base version of Jamba. It’s a pretrained, mixture-of-experts (MoE) generative text model, with 12B active parameters and a total of 52B parameters across all experts. It supports a 256K context length, and can fit up to 140K tokens on a single 80GB GPU.
 ----
+
 # Example output:
 
 **Example 1:**
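For context on the "How to use" section this README links to, here is a minimal sketch of loading a causal LM from the Hub with the standard transformers `AutoTokenizer`/`AutoModelForCausalLM` API. The repo id `aidal/Persian-Mistral-7B` is taken from the navigation links in the diff above; the dtype, device placement, and prompt are illustrative assumptions, not something this commit specifies.

```python
# Minimal sketch (not from this diff): load the model the README links to
# and generate a short completion with the generic transformers API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aidal/Persian-Mistral-7B"  # repo id taken from the README's navigation links

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: half precision to fit a single GPU
    device_map="auto",          # assumption: requires the accelerate package
)

prompt = "پایتخت ایران کجاست؟"  # "What is the capital of Iran?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

This mirrors the generic Hub loading pattern only; the model card's own "How to use" section (not shown in this hunk) remains the authoritative reference for exact usage.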