SandLogicTechnologies committed on
Commit
36bccec
·
verified ·
1 Parent(s): 440f8b4

Create README.md

Files changed (1)
  1. README.md +135 -0
README.md ADDED
@@ -0,0 +1,135 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ pipeline_tag: text-generation
+ library_name: transformers
+ tags:
+ - SLM
+ - Conversational
+ ---
+ # SandLogic Technologies - Quantized SmolLM-1.7B-Instruct Models
+
+ ## Model Description
+
+ We have quantized the SmolLM-1.7B-Instruct model into three variants:
+
+ 1. Q5_K_M
+ 2. Q4_K_M
+ 3. IQ4_XS
+
+ These quantized models reduce file size and memory use while aiming to keep output quality close to that of the original model.
+
+ Discover our full range of quantized language models by visiting our [SandLogic Lexicon](https://github.com/sandlogic/SandLogic-Lexicon) GitHub repository. To learn more about our company and services, check out our website at [SandLogic](https://www.sandlogic.com).
+
+ ## Original Model Information
+
+ - **Name**: SmolLM-1.7B-Instruct
+ - **Model Type**: Small language model
+ - **Parameters**: 1.7 billion
+ - **Training Data**: SmolLM-Corpus (curated high-quality educational and synthetic data)
+
+ ## Model Capabilities
+
+ SmolLM-1.7B-Instruct is designed for various natural language processing tasks, with capabilities including:
+
+ - General knowledge question answering
+ - Creative writing
+ - Basic Python programming
+
+ ## Finetuning Details
+
+ The model was finetuned on a mixture of datasets, including:
+
+ - 2k simple everyday conversations generated by llama3.1-70B
+ - Magpie-Pro-300K-Filtered
+ - StarCoder2-Self-OSS-Instruct
+ - A small subset of OpenHermes-2.5
+
+ ## Limitations
+
+ - English language only
+ - May struggle with arithmetic, editing tasks, and complex reasoning
+ - Generated content may not always be factually accurate or logically consistent
+ - Potential biases from training data
+
+ ## Intended Use
+
+ 1. **Educational Assistance**: Helping students with general knowledge questions and basic programming concepts.
+ 2. **Creative Writing Aid**: Assisting in generating ideas or outlines for creative writing projects.
+ 3. **Conversational AI**: Powering chatbots for simple, everyday conversations.
+ 4. **Code Completion**: Providing suggestions for basic Python programming tasks.
+ 5. **General Knowledge Queries**: Answering straightforward questions on various topics.
+
+ ## Model Variants
+
+ We offer three quantized versions of the SmolLM-1.7B-Instruct model:
+
+ 1. **Q5_K_M**: 5-bit k-quant quantization, medium variant
+ 2. **Q4_K_M**: 4-bit k-quant quantization, medium variant
+ 3. **IQ4_XS**: 4-bit i-quant quantization, extra-small variant
+
+ These quantized models aim to reduce model size and improve inference speed while maintaining performance as close to the original model as possible.
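As a rough illustration of how the variants above trade size for precision, the sketch below estimates file size as parameters times bits per weight. The bits-per-weight values are approximate assumptions for each llama.cpp quantization type, not measurements of the files in this repository, and `estimate_size_gb` is a hypothetical helper, so treat the numbers as an order-of-magnitude guide only.

```python
# Rough size estimate: parameters * bits-per-weight / 8 bytes.
# Real GGUF files also carry metadata and keep some tensors at higher
# precision, so actual file sizes will differ from these figures.
PARAMS = 1.7e9  # parameter count stated in this README

# Approximate bits per weight for each type (assumed, not measured).
APPROX_BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.85,
    "IQ4_XS": 4.25,
}


def estimate_size_gb(params: float, bits_per_weight: float) -> float:
    """Return the approximate file size in gigabytes (10**9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


for name, bpw in APPROX_BITS_PER_WEIGHT.items():
    print(f"{name:>7}: ~{estimate_size_gb(PARAMS, bpw):.2f} GB")
```

Under these assumptions the ordering from largest to smallest is F16, Q5_K_M, Q4_K_M, IQ4_XS, which is the usual reason to pick a lower-bit variant on memory-constrained hardware.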
+
+ ## Usage
+
+ ```bash
+ pip install llama-cpp-python
+ ```
+ Please refer to the llama-cpp-python [documentation](https://llama-cpp-python.readthedocs.io/en/latest/) to install with GPU support.
+
+ ### Basic Chat Completion
+ Here's an example demonstrating how to use the high-level API for a basic chat completion:
+
+ ```python
+ from llama_cpp import Llama
+
+ llm = Llama(
+     model_path="./models/SmolLM-1.7B-Instruct.Q5_K_M.gguf",
+     verbose=False,
+     # n_gpu_layers=-1,  # Uncomment to use GPU acceleration
+     # n_ctx=2048,  # Uncomment to increase the context window
+ )
+
+ output = llm.create_chat_completion(
+     messages=[
+         {"role": "system", "content": "You are an AI assistant who helps users answer their questions."},
+         {"role": "user", "content": "What is the capital of France?"},
+     ]
+ )
+
+ print(output["choices"][0]["message"]["content"])
+ ```
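The response returned by `create_chat_completion` is a nested dictionary, as the `print` call above shows. The sketch below wraps message construction and reply extraction in two small helpers. `build_messages`, `extract_reply`, and the stand-in `fake_response` are hypothetical names introduced for illustration; they are not part of llama-cpp-python, and the stand-in only mirrors the fields used in the example above.

```python
from typing import Dict, List


def build_messages(
    user_prompt: str,
    system_prompt: str = "You are a helpful assistant.",
) -> List[Dict[str, str]]:
    """Build the system + user message list expected by a chat completion call."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def extract_reply(response: dict) -> str:
    """Pull the assistant text out of the first choice and trim whitespace."""
    return response["choices"][0]["message"]["content"].strip()


# Stand-in for the dict returned by llm.create_chat_completion(...),
# so the helpers can be tried without loading a model.
fake_response = {
    "choices": [{"message": {"role": "assistant", "content": " Paris. "}}]
}

print(build_messages("What is the capital of France?"))
print(extract_reply(fake_response))
```

With a loaded model, the same helpers could be combined as `extract_reply(llm.create_chat_completion(messages=build_messages("...")))`.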
+
+ ## Download
+ You can download models in GGUF format directly from Hugging Face using the `Llama.from_pretrained` method. This feature requires the `huggingface-hub` package.
+
+ To install it, run: `pip install huggingface-hub`
+
+ ```python
+ from llama_cpp import Llama
+
+ llm = Llama.from_pretrained(
+     repo_id="SandLogicTechnologies/SmolLM-1.7B-Instruct-GGUF",
+     filename="*SmolLM-1.7B-Instruct.Q5_K_M.gguf",
+     verbose=False,
+ )
+ ```
+ By default, `from_pretrained` downloads the model to the Hugging Face cache directory. You can manage installed model files using the `huggingface-cli` tool.
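To see which GGUF files are already in the cache, a standard-library sketch such as the one below may help. It assumes the default cache location (`~/.cache/huggingface/hub`, overridable through the `HF_HUB_CACHE` or `HF_HOME` environment variables); `default_hub_cache` and `list_gguf_files` are hypothetical helper names, not part of any Hugging Face package.

```python
import os
from pathlib import Path
from typing import List, Tuple


def default_hub_cache() -> Path:
    """Guess the Hugging Face hub cache directory (assumed default layout)."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get(
        "HF_HOME", os.path.expanduser("~/.cache/huggingface")
    )
    return Path(hf_home) / "hub"


def list_gguf_files(cache_dir: Path) -> List[Tuple[Path, int]]:
    """Return (path, size_in_bytes) for every .gguf file under cache_dir."""
    if not cache_dir.is_dir():
        return []
    return sorted(
        (p, p.stat().st_size)
        for p in cache_dir.rglob("*.gguf")
        if p.is_file()  # skips directories and broken symlinks
    )


for path, size in list_gguf_files(default_hub_cache()):
    print(f"{size / 1e9:6.2f} GB  {path}")
```

This only lists files; deleting cached models is better left to the `huggingface-cli` tool mentioned above, which understands the cache layout.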
+
+ ## Acknowledgements
+
+ We thank the original developers of SmolLM for their contributions to the field of small language models.
+ Special thanks to Georgi Gerganov and the entire llama.cpp development team for their outstanding contributions.
+
+ ## Contact
+
+ For any inquiries or support, please contact us at [email protected] or visit our [support page](https://www.sandlogic.com).