---
language:
- en
base_model:
- allenai/Llama-3.1-Tulu-3-8B
tags:
- llama
- math
- conversational
---
# Quantized Llama-3.1-Tulu-3-8B Models

This repository contains Q4_KM and Q5_KM quantized versions of the [allenai/Llama-3.1-Tulu-3-8B](https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B) model. These quantized variants provide efficient alternatives while maintaining the core capabilities of Tülu3, a leading instruction-following model family.

## Model Overview

- **Original Model**: Llama-3.1-Tulu-3-8B
- **Quantized Versions**:
  - Q4_KM (4-bit quantization)
  - Q5_KM (5-bit quantization)
- **Base Architecture**: 8B-parameter instruction-following model
- **Developer**: Allen Institute for AI
- **License**: Llama 3.1 Community License Agreement
- **Language**: Primarily English
- **Finetuned From**: allenai/Llama-3.1-Tulu-3-8B-DPO
## Quantization Details

### Q4_KM Version
- Model size reduction: ~75% smaller than the original
- Memory footprint: 4.92 GB
- Optimized for deployment in resource-constrained environments
- Maintains core functionality with minimal performance impact

### Q5_KM Version
- Model size reduction: ~69% smaller than the original
- Memory footprint: 5.73 GB
- Higher precision than Q4_KM
- Better preservation of model quality
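
Either variant can be fetched programmatically with the `huggingface_hub` client. A minimal sketch, assuming the repo ID and GGUF filename shown below (both are illustrative; check this repository's file list for the exact names):

```python
from huggingface_hub import hf_hub_download

# Repo ID and filename are assumptions; verify them against this
# repository's "Files and versions" tab before running.
model_path = hf_hub_download(
    repo_id="SandLogicTechnologies/Llama-3.1-Tulu-3-8B-GGUF",
    filename="Llama-3.1-Tulu-3-8B.Q4_K_M.gguf",  # or the Q5_K_M file (~5.73 GB)
)
print(model_path)  # local cached path, ready to pass to llama-cpp-python
```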
## Key Features

Both quantized versions maintain Tülu3's state-of-the-art performance on:
- Instruction-following tasks
- Mathematical reasoning (MATH dataset)
- Grade-school math problems (GSM8K)
- General instruction following (IFEval)
- Chat-based interactions
- Complex reasoning tasks

## Usage

```python
from llama_cpp import Llama

# Load the quantized model (adjust the path to wherever you saved the GGUF file)
llm = Llama(
    model_path="./models/7B/Llama-3.1-Tulu-3-8B.gguf",
    verbose=False,
    # n_gpu_layers=-1,  # Uncomment to use GPU acceleration
    # n_ctx=2048,       # Uncomment to increase the context window
)

output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an AI assistant who helps answer user questions."},
        {"role": "user", "content": "Write Python code to find prime numbers."},
    ]
)

print(output["choices"][0]["message"]["content"])
```
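
`llama-cpp-python` can also stream tokens as they are generated, which suits chat-style use. A minimal follow-up sketch, reusing the `llm` instance from above on a GSM8K-style question:

```python
# Request a streamed response instead of waiting for the full completion
stream = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "A tray holds 12 muffins. How many muffins fit on 7 trays?"}
    ],
    stream=True,
)

# Each chunk carries an OpenAI-style delta; print content pieces as they arrive
for chunk in stream:
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:
        print(delta["content"], end="", flush=True)
print()
```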

## Training Data

The model was trained on a diverse mix of:
- Publicly available datasets
- Synthetic data
- Human-created datasets

## Bias, Risks, and Limitations

These quantized models inherit the limitations of the original Tülu3 model:
- Limited safety training compared to models with active filtering
- Can produce problematic outputs, especially when prompted to do so
- Unknown composition of the base Llama 3.1 training corpus
- Additional considerations for the quantized versions:
  - Slight degradation in performance compared to the full-precision model
  - May show increased variance in mathematical reasoning tasks
  - Q4_KM may exhibit more pronounced quality loss in complex scenarios

## Recommended Use Cases

- Research and development
- Educational applications
- Resource-constrained deployments
- Edge computing scenarios (see the sketch below)
- Prototyping and testing
- Applications requiring faster inference
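
For the resource-constrained and edge scenarios above, `llama-cpp-python` exposes settings that cap memory and CPU usage. A minimal sketch with illustrative values; tune them to your hardware and adjust the model path to your local GGUF file:

```python
from llama_cpp import Llama

# Conservative settings for a small-memory, CPU-only device (illustrative values)
llm = Llama(
    model_path="./models/7B/Llama-3.1-Tulu-3-8B.gguf",  # the Q4_KM file keeps RAM usage lowest
    n_ctx=1024,     # smaller context window -> smaller KV cache
    n_threads=4,    # match the physical cores available on the device
    n_batch=128,    # smaller prompt-processing batches reduce peak memory
    verbose=False,
)

# Plain completion call; returns an OpenAI-style dict
result = llm("Q: What is 17 + 25?\nA:", max_tokens=16)
print(result["choices"][0]["text"])
```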

## Acknowledgments

These quantized models are based on the work of the Allen Institute for AI and the Llama 3.1 team. Special thanks to Georgi Gerganov and the entire llama.cpp development team for their outstanding contributions.

## Contact

For any inquiries or support, please contact us at [email protected] or visit our [Website](https://www.sandlogic.com/).