ZeroXClem committed 034c330 (parent: 1b7b089): Update README.md

Files changed (1): README.md (+157, -12)

README.md CHANGED
**Previous version:**

license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- Chainbase-Labs/Theia-Llama-3.1-8B-v1
- EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
- aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
- DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
---

# ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B

ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
* [Chainbase-Labs/Theia-Llama-3.1-8B-v1](https://huggingface.co/Chainbase-Labs/Theia-Llama-3.1-8B-v1)
* [EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO](https://huggingface.co/EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO)
* [aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored](https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored)
* [DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst](https://huggingface.co/DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst)

## 🧩 Configuration

```yaml
models:
  - model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
    parameters:
      # ... (per-model parameters and the remaining model entries are not shown in this diff)
      weight: 0.1

merge_method: ties
base_model: DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst # Using the ldm_soup as a base for optimal diffusion of strengths
dtype: bfloat16
parameters:
  normalize: true # Ensuring consistent scaling of layer outputs
  out_dtype: float16 # Maintaining efficiency and precision
```
 
**Updated version:**

license: apache-2.0
tags:
- merge
- model_fusion
- TIES
- Llama3.1
- crypto
- blockchain
- coding_assistant
- creative_writing
- roleplaying
- uncensored
- latent_diffusion
- long_context
- agentic_AI
- multi_domain
- research
- instruction-following
- technical_reasoning
- task_generalization
- AI_tools
- GPT
base_model:
- Chainbase-Labs/Theia-Llama-3.1-8B-v1
- EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
- aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
- DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
datasets:
- CoinMarketCap
- blockchain_projects
- agentic_code_DPO
library_name: transformers
---

# ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B

**Architecture:** Llama 3.1 (8B)
**Proposed Name:** Llama3.1-TheiaFire-DarkFusion-8B
**Merge Method:** TIES
**Merge Date:** 10/25/2024
**License:** Apache 2.0

---

## Model Overview

**Llama3.1-TheiaFire-DarkFusion-8B** is a specialized fusion of four models, combined to balance technical reasoning, creativity, and uncensored freedom across a variety of use cases. Whether you need advanced coding assistance, blockchain insights, creative roleplaying, or general-purpose capabilities, this model aims to deliver strong results.

The models were merged with the **TIES** method, which trims low-magnitude parameter changes and resolves sign conflicts between the source models before combining them, so that each model's strengths are preserved rather than averaged away.
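
For intuition, the sketch below illustrates the TIES idea on toy tensors (trim each model's parameter deltas, elect a per-parameter sign, then average only the deltas that agree with it). It is a simplified illustration of the technique, not the mergekit implementation, and the random vectors stand in for real model weights.

```python
# Toy TIES merge on flat parameter vectors: trim -> elect sign -> disjoint mean.
# Illustration only; real merges operate per-tensor over full model checkpoints.
import torch

def ties_merge(base: torch.Tensor, finetuned: list[torch.Tensor], density: float = 0.5) -> torch.Tensor:
    deltas = [ft - base for ft in finetuned]                      # task vectors
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.numel()))                      # keep the top-`density` fraction by magnitude
        threshold = d.abs().flatten().kthvalue(d.numel() - k + 1).values
        trimmed.append(torch.where(d.abs() >= threshold, d, torch.zeros_like(d)))
    stacked = torch.stack(trimmed)
    sign = torch.sign(stacked.sum(dim=0))                         # elect a sign per parameter
    agree = (torch.sign(stacked) == sign) & (stacked != 0)        # keep deltas that agree with the elected sign
    merged_delta = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged_delta

# Example: three "fine-tuned" vectors derived from a shared base.
base = torch.zeros(10)
finetuned = [base + 0.1 * torch.randn(10) for _ in range(3)]
print(ties_merge(base, finetuned))
```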

---

## Model Components

The following models were merged to create **Llama3.1-TheiaFire-DarkFusion-8B**:

1. **[Theia-Llama-3.1-8B-v1](https://huggingface.co/Chainbase-Labs/Theia-Llama-3.1-8B-v1)**
   - **Purpose:** Contributes technical vision and crypto-domain capability.
   - **Training Focus:** Specializes in blockchain data; trained on a large corpus of crypto whitepapers, research reports, and market data.
   - **Unique Feature:** Fine-tuned with LoRA for optimized crypto-specific performance.

2. **[EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO](https://huggingface.co/EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO)**
   - **Purpose:** Specializes in agentic reasoning and advanced coding tasks.
   - **Unique Feature:** Ships with a 128K context window and built-in tools for ReAct-style reasoning, calculation, search, and more.

3. **[aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored](https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored)**
   - **Purpose:** Provides uncensored, creativity-driven responses, ideal for writing, role-playing, and in-depth conversations.
   - **Unique Feature:** Its uncensored nature allows open exploration of creative writing and darker, more complex roleplay scenarios.

4. **[DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst](https://huggingface.co/DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst)**
   - **Purpose:** Enhances performance through latent diffusion model blending.
   - **Unique Feature:** Builds on Llama-3.1's foundation and improves generalization to unseen tasks via latent diffusion.

---

## Model Specifications

### Merge Configuration

```yaml
# Llama3.1-TheiaFire-DarkFusion-8B Merge Configuration
models:
  - model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
    parameters:
      # ... (the remaining per-model parameters and the other three model entries are not shown in this diff)
      weight: 0.1

merge_method: ties
base_model: Theia-Llama-3.1-8B-v1
dtype: bfloat16
parameters:
  normalize: true
  out_dtype: float16
```
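
To reproduce a merge from a config like this, mergekit is the expected tool; the usual route is the `mergekit-yaml config.yaml ./output-dir` CLI. Below is a rough sketch using mergekit's Python entry points (`MergeConfiguration` / `run_merge`) as documented upstream; the file path and option values are placeholders, and the exact API should be checked against the mergekit version you install.

```python
# Sketch: driving a mergekit merge from Python. Assumes mergekit is installed;
# the mergekit-yaml CLI is the simpler and more common entry point.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "theiafire-darkfusion.yaml"            # the YAML shown above, saved locally (placeholder path)
OUTPUT_PATH = "./Llama3.1-TheiaFire-DarkFusion-8B"  # where the merged checkpoint is written

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path=OUTPUT_PATH,
    options=MergeOptions(
        cuda=torch.cuda.is_available(),   # run the merge on GPU when available
        copy_tokenizer=True,              # copy the base model's tokenizer into the output
    ),
)
```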

---

## Intended Use Cases

1. **Crypto Analysis & Blockchain Projects**
   - Leverages data from CoinMarketCap and research reports for in-depth analysis of blockchain projects and crypto markets.
   - Ideal for creating blockchain-related content or automating crypto data analysis.

2. **Advanced Coding Assistant**
   - Built-in support for agentic behavior such as reasoning and acting, making it well suited to AI-driven coding assistance.
   - Handles large-scale coding projects with tools like search and calculator integration.

3. **Creative Writing & Roleplay**
   - **Uncensored output** allows rich, expressive writing, ideal for novels, creative pieces, or roleplay scenarios.
   - Capable of producing nuanced, emotionally complex character responses in roleplaying games or interactive storytelling.

4. **Unseen Task Generalization**
   - Latent diffusion capabilities let the model adapt its weight distributions to new tasks, improving performance on novel datasets.

---

## Performance

- Improvements in **multi-domain reasoning**, **code generation**, and **unconstrained creative output** (qualitative observations; no benchmark scores are reported in this card).
- **Enhanced task generalization**, attributed to the latent diffusion blending contributed by ldm_soup.

---

## Model Capabilities

- **Context Window**: 128K tokens, capable of handling long-form tasks such as novel writing and in-depth research (a quick way to confirm this from the model config is shown below).
- **Agentic Tools**: Built-in tools such as search and calculator.
- **Safety**: While uncensored, responsible prompting is encouraged to ensure the best user experience and ethical usage.
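
A quick way to sanity-check the advertised context window from the published config, without downloading the weights (this assumes the merged repo exposes a standard Llama-style `config.json`):

```python
from transformers import AutoConfig

# Loads only config.json from the Hub; no model weights are downloaded.
config = AutoConfig.from_pretrained("ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B")
print(config.max_position_embeddings)  # 131072 corresponds to the advertised 128K-token window
```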

---

## Usage

This model can be used with popular libraries such as **Transformers** and **LangChain**. Below is a basic setup using **Transformers**, followed by a LangChain variant:

### Example Code

```python
import transformers
import torch

# Full Hugging Face repo id of the merged model
model_id = "ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an AI assistant skilled in coding and creative writing."},
    {"role": "user", "content": "Please write me a Python function to compute the factorial of a number."},
]

outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])  # last message in the returned conversation (the assistant's reply)
```
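
Since the card also mentions LangChain, here is a minimal sketch using the langchain-huggingface integration; the package name, the `HuggingFacePipeline.from_model_id` signature, and the generation settings are assumptions about that integration rather than something this card specifies.

```python
# Minimal sketch: exposing the merged model to LangChain through the
# langchain-huggingface integration (assumes: pip install langchain-huggingface).
from langchain_huggingface import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 256},
)

# The wrapped pipeline behaves like any other LangChain LLM.
print(llm.invoke("Write a Python function that computes the factorial of a number."))
```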

---

## Limitations

- **Uncensored Output**: While this model offers creative freedom, it may produce content that is inappropriate or unsuitable for certain contexts.
- **Bias**: As with all language models, it may reflect biases inherent in its training data. Users are encouraged to review and edit outputs before use.

---

## Acknowledgments

This model is a collective effort, combining the groundbreaking work of:

- **Chainbase Labs** (Theia-Llama)
- **EpistemeAI** (Fireball Meta-Llama)
- **aifeifei798** (DarkIdol)
- **DeepAutoAI** (ldm_soup)

Special thanks to the open-source community and the developers who contributed to the training and fine-tuning of these models.

---