Update README.md

README.md (CHANGED)
---
license: apache-2.0
tags:
- merge
- model_fusion
- TIES
- Llama3.1
- crypto
- blockchain
- coding_assistant
- creative_writing
- roleplaying
- uncensored
- latent_diffusion
- long_context
- agentic_AI
- multi_domain
- research
- instruction-following
- technical_reasoning
- task_generalization
- AI_tools
- GPT
base_model:
- Chainbase-Labs/Theia-Llama-3.1-8B-v1
- EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
- aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
- DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
datasets:
- CoinMarketCap
- blockchain_projects
- agentic_code_DPO
library_name: transformers
---
# ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B

- **Architecture:** Llama 3.1 (8B)
- **Proposed Name:** Llama3.1-TheiaFire-DarkFusion-8B
- **Merge Method:** TIES
- **Merge Date:** 10/25/2024
- **License:** Apache 2.0

---
## Model Overview

**Llama3.1-TheiaFire-DarkFusion-8B** is a fusion of four specialized models, combined to balance technical reasoning, creativity, and uncensored output across a variety of use cases. Whether you need advanced coding assistance, blockchain insight, creative roleplay, or general-purpose generation, this merge is designed to cover all of them in a single 8B model.

The models were combined with the **TIES** merge method, which builds a task vector for each source model, trims the low-magnitude updates, resolves sign conflicts between the remaining ones, and averages only the agreeing deltas, so each donor's strongest contributions survive the merge with minimal interference.
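For intuition, here is a minimal, illustrative sketch of the TIES steps (build task vectors, trim them by magnitude, elect a majority sign per parameter, then merge only the agreeing deltas) on flat tensors. It is not the mergekit code that produced this model; the `ties_merge` helper, the toy tensors, and the `density=0.5` setting are our own illustrative assumptions.

```python
# Toy sketch of TIES merging on flat parameter tensors -- illustration only,
# not the mergekit implementation used for this model.
import torch

def ties_merge(base: torch.Tensor, finetuned: list[torch.Tensor],
               weights: list[float], density: float = 0.5) -> torch.Tensor:
    # 1. Task vectors: how each fine-tune differs from the shared base.
    deltas = [ft - base for ft in finetuned]

    # 2. Trim: keep only the top-`density` fraction of each delta by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.numel()))
        threshold = d.abs().flatten().kthvalue(d.numel() - k + 1).values
        trimmed.append(torch.where(d.abs() >= threshold, d, torch.zeros_like(d)))

    # 3. Elect sign: per-parameter majority sign of the weighted, trimmed deltas.
    weighted = [w * d for w, d in zip(weights, trimmed)]
    sign = torch.sign(sum(weighted))

    # 4. Disjoint merge: average only the deltas that agree with the elected sign.
    agree = [(torch.sign(d) == sign) & (d != 0) for d in weighted]
    num = sum(torch.where(a, d, torch.zeros_like(d)) for a, d in zip(agree, weighted))
    den = sum(a.to(base.dtype) * w for a, w in zip(agree, weights))
    return base + num / den.clamp(min=1e-8)

# Toy example: three "fine-tunes" of a 6-parameter layer.
base = torch.zeros(6)
fts = [base + torch.randn(6) for _ in range(3)]
print(ties_merge(base, fts, weights=[0.4, 0.4, 0.2], density=0.5))
```

mergekit applies the same idea per weight tensor across all four source models, driven by the weights and `normalize` flag in the configuration shown later in this card.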
---
## Model Components

The following models were merged to create **Llama3.1-TheiaFire-DarkFusion-8B**:

1. **[Theia-Llama-3.1-8B-v1](https://huggingface.co/Chainbase-Labs/Theia-Llama-3.1-8B-v1)**
   - **Purpose:** Contributes technical vision and crypto-domain capability.
   - **Training Focus:** Specializes in blockchain data; trained on a large corpus of crypto whitepapers, research reports, and market data.
   - **Unique Feature:** Fine-tuned with LoRA for crypto-specific performance.

2. **[EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO](https://huggingface.co/EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO)**
   - **Purpose:** Specialized in agentic reasoning and advanced coding tasks.
   - **Unique Feature:** Ships a 128K context window and built-in tooling for ReAct-style prompting, calculator use, search, and more.

3. **[aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored](https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored)**
   - **Purpose:** Provides uncensored, creativity-driven responses suited to writing, roleplay, and in-depth conversation.
   - **Unique Feature:** Its uncensored nature allows open exploration of creative writing and darker, more complex roleplay scenarios.

4. **[DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst](https://huggingface.co/DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst)**
   - **Purpose:** Improves overall performance through latent-diffusion weight blending.
   - **Unique Feature:** Builds on the Llama 3.1 foundation and improves generalization to unseen tasks via latent diffusion over model weights.

---
## Model Specifications

### Merge Configuration

```yaml
# Llama3.1-TheiaFire-DarkFusion-8B Merge Configuration
models:
  - model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
    parameters:
      # ... (the per-model parameter entries between here and `weight: 0.1`
      # are unchanged in this commit and are not shown in the diff) ...
      weight: 0.1
merge_method: ties
base_model: Theia-Llama-3.1-8B-v1
dtype: bfloat16
parameters:
  normalize: true
out_dtype: float16
```
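To reproduce a merge from a config like the one above, mergekit can be driven from Python. This is a hedged sketch: the `MergeConfiguration`/`run_merge`/`MergeOptions` import paths reflect common mergekit usage and may differ across versions, and `config.yaml` plus the output directory are placeholder names; the `mergekit-yaml config.yaml ./merged` CLI is the simpler route.

```python
# Sketch: run a TIES merge from the YAML above with mergekit's Python API.
# Assumes `pip install mergekit`; paths below are placeholders.
import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./Llama3.1-TheiaFire-DarkFusion-8B",  # output directory for merged weights
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU when one is available
        copy_tokenizer=True,             # carry the base model's tokenizer into the output
        lazy_unpickle=True,              # stream shards to keep peak RAM down
    ),
)
```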
---

## Intended Use Cases
1. **Crypto Analysis & Blockchain Projects**
   - Leverages CoinMarketCap data and research reports for in-depth analysis of blockchain projects and crypto markets.
   - Well suited to producing blockchain-related content or automating crypto data analysis.

2. **Advanced Coding Assistant**
   - Built-in support for agentic behavior such as reasoning and acting, making it a strong base for AI-driven coding assistance (a ReAct-style scaffold is sketched after this list).
   - Handles large coding projects, with room for tool integrations such as search and a calculator.

3. **Creative Writing & Roleplay**
   - **Uncensored output** allows rich, expressive writing for novels, creative pieces, and roleplay scenarios.
   - Produces nuanced, emotionally complex character responses in roleplaying games and interactive storytelling.

4. **Unseen Task Generalization**
   - Thanks to the latent-diffusion component, the model adapts its learned weight distributions to handle unseen tasks, improving performance on novel datasets.
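As referenced in the coding-assistant item above, agentic (ReAct-style) behavior can be scripted around the model with an ordinary generate-parse-observe loop. The scaffold below is an illustrative sketch written for this card, not a tool interface shipped with the model: the calculator tool, prompt wording, and loop bound are all assumptions.

```python
# Illustrative ReAct-style scaffold around the merged model (toy example).
import re
import torch
import transformers

pipe = transformers.pipeline(
    "text-generation",
    model="ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

SYSTEM = (
    "You are a careful assistant. Think step by step. When you need arithmetic, "
    "emit a single line 'Action: calculate(<python expression>)' and stop; you will "
    "receive an 'Observation: <result>' message. End with 'Final Answer: <answer>'."
)

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "What is 37 * 411 + 12? Use the calculator."},
]

for _ in range(3):  # allow a few think/act/observe rounds
    reply = pipe(messages, max_new_tokens=200)[0]["generated_text"][-1]["content"]
    messages.append({"role": "assistant", "content": reply})
    action = re.search(r"Action:\s*calculate\((.+)\)", reply)
    if action is None:
        break  # no tool call, so treat the reply as the final answer
    result = eval(action.group(1), {"__builtins__": {}})  # toy tool; never eval untrusted input
    messages.append({"role": "user", "content": f"Observation: {result}"})

# Print the model's last reply (its final answer).
print(next(m["content"] for m in reversed(messages) if m["role"] == "assistant"))
```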
---

## Performance

- The merge targets improvements in **multi-domain reasoning**, **code generation**, and **unconstrained creative output** relative to its individual source models.
- **Enhanced task generalization**, driven by the latent-diffusion weight blending contributed by the ldm_soup component.

---
## Model Capabilities

- **Context Window:** 128K tokens, enough for long-form tasks such as novel writing and in-depth research (a quick config check is sketched below).
- **Agentic Tools:** Built-in affordances for tool use such as search and a calculator.
- **Safety:** The model is uncensored; responsible prompting is encouraged to keep usage ethical and appropriate to context.
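A quick way to confirm the advertised context window is to read the published model config (repo id taken from this card's title):

```python
# Check the merged model's maximum context length from its config.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B")
print(cfg.max_position_embeddings)  # 131072 corresponds to the advertised 128K window
```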
---

## Usage

The model works with popular libraries such as **Transformers** and **LangChain**. Below is a basic setup using **Transformers**:

### Example Code
```python
import torch
import transformers

model_id = "ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an AI assistant skilled in coding and creative writing."},
    {"role": "user", "content": "Please write me a Python function to compute the factorial of a number."},
]

outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])  # the assistant's reply (last message of the returned chat)
```
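For lower-level control than the pipeline, the same chat can be run through the tokenizer's chat template directly. A minimal sketch, assuming the merged repo ships the standard Llama 3.1 tokenizer and chat template:

```python
# Lower-level variant: apply the chat template and call generate() directly.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are an AI assistant skilled in coding and creative writing."},
    {"role": "user", "content": "Please write me a Python function to compute the factorial of a number."},
]

# add_generation_prompt=True appends the header tokens that cue the assistant's turn.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```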
---

## Limitations

- **Uncensored Output:** While the model offers creative freedom, it may produce content that is inappropriate or unsuitable for certain contexts.
- **Bias:** As with all language models, it may reflect biases present in its training data. Users are encouraged to review and edit outputs before use.

---

## Acknowledgments

This model is a collective effort, combining the groundbreaking work from:

- **Chainbase Labs** (for Theia-Llama)
- **EpistemeAI** (for Fireball Meta-Llama)
- **aifeifei798** (for DarkIdol)
- **DeepAutoAI** (for LDM Soup)

Special thanks to the open-source community and the developers who contributed to the training and fine-tuning of these models.

---