Triangle104 committed on
Commit 2f998ba · verified · 1 Parent(s): 31d4b30

Update README.md

  1. README.md +25 -0
README.md CHANGED
@@ -18,6 +18,31 @@ library_name: transformers
  This model was converted to GGUF format from [`tiiuae/Falcon3-Mamba-7B-Instruct`](https://huggingface.co/tiiuae/Falcon3-Mamba-7B-Instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/tiiuae/Falcon3-Mamba-7B-Instruct) for more details on the model.
 
+ ---
+ Model details:
+ -
+ The Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters.
+
+ This repository contains Falcon3-Mamba-7B-Instruct. Compared with similar SSM-based models of the same size, it achieved state-of-the-art results at the time of release on reasoning, language understanding, instruction following, code, and mathematics tasks. Falcon3-Mamba-7B-Instruct supports a context length of up to 32K and was trained mainly on an English corpus.
+
+ Model Details
+
+ Architecture (same as Falcon-Mamba-7b)
+
+ Mamba1-based causal decoder-only architecture trained on a causal language modeling task (i.e., predicting the next token).
+ 64 decoder blocks
+ width: 4096
+ state_size: 16
+ 32k context length
+ 65k vocab size
+
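The architecture figures above can be turned into a rough estimate of how much recurrent state a Mamba-style model carries per sequence. The following is only a sketch under stated assumptions: `MambaSpec` is a hypothetical helper written for this illustration, the expansion factor of 2 is assumed rather than taken from the model card, and the convolution state that Mamba blocks also keep is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MambaSpec:
    """Hyperparameters quoted in the model card, plus one labeled assumption."""

    num_blocks: int = 64   # from the card: 64 decoder blocks
    width: int = 4096      # from the card: model width
    state_size: int = 16   # from the card: SSM state size
    expand: int = 2        # ASSUMPTION: expansion factor, not stated in the card

    @property
    def inner_width(self) -> int:
        # Width of the expanded channel dimension inside each block.
        return self.width * self.expand

    def ssm_state_values(self) -> int:
        # One (inner_width x state_size) state matrix per block, per sequence.
        # The short convolution state is deliberately left out of this count.
        return self.num_blocks * self.inner_width * self.state_size


spec = MambaSpec()
print(spec.ssm_state_values())  # → 8388608
```

Under these assumptions the count does not depend on how many tokens have been processed, which is the property that distinguishes an SSM's recurrent state from a Transformer's key-value cache.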
+ Continually pretrained from Falcon-Mamba-7b on another 1,500 gigatokens of web, code, STEM, and high-quality data.
+ Post-trained on 1.2 million samples of STEM, conversations, code, and safety data.
+ Developed by Technology Innovation Institute
+ License: TII Falcon-LLM License 2.0
+ Model Release Date: December 2024
+
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)