license: apache-2.0
tags:
- merge
- model_fusion
- TIES
- Llama3.1
- crypto
- blockchain
- coding_assistant
- creative_writing
- roleplaying
- uncensored
- latent_diffusion
- long_context
- agentic_AI
- multi_domain
- research
- instruction-following
- technical_reasoning
- task_generalization
- AI_tools
- GPT
base_model:
- Chainbase-Labs/Theia-Llama-3.1-8B-v1
- EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
- aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
- DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
datasets:
- CoinMarketCap
- blockchain_projects
- agentic_code_DPO
libraries: transformers
library_name: transformers
# ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B
**Architecture:** Llama 3.1 - 8B
**Proposed Name:** Llama3.1-TheiaFire-DarkFusion-8B
**Merge Method:** TIES
**Merge Date:** 10/25/2024
**License:** Apache 2.0

---
## Model Overview
**Llama3.1-TheiaFire-DarkFusion-8B** is a fusion of four fine-tuned 8B Llama models, combined to balance technical reasoning, creativity, and uncensored output. It targets advanced coding assistance, blockchain insights, creative roleplaying, and general-purpose assistant tasks.

The models were merged with the **TIES** method, which blends the layer weights of the components so that the result retains strengths from each domain.

---

For best results in LM Studio, leave the system prompt blank. The tokenizer seems to struggle with system prompts.
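
A request without a system turn can be built as in the sketch below. It assumes the role/content message format used by chat-style APIs. `build_messages` is a hypothetical helper, not part of any library, and the issue above is only reported for LM Studio, so treat this as a precaution elsewhere.

```python
from typing import Dict, List, Optional


def build_messages(user_prompt: str, system_prompt: Optional[str] = None) -> List[Dict[str, str]]:
    """Build a chat message list, omitting the system turn when it is blank.

    Hypothetical helper: an empty or whitespace-only system prompt is dropped
    entirely rather than sent as an empty system message.
    """
    messages: List[Dict[str, str]] = []
    if system_prompt and system_prompt.strip():
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


# With no system prompt, only the user turn is sent.
print(build_messages("Explain what a Merkle tree is."))
```

Dropping the system turn entirely, rather than sending an empty string, keeps the chat template from emitting an empty system block.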
## Model Components
The following models were merged to create **Llama3.1-TheiaFire-DarkFusion-8B**:

1. **[Chainbase-Labs/Theia-Llama-3.1-8B-v1](https://huggingface.co/Chainbase-Labs/Theia-Llama-3.1-8B-v1)**
   - **Purpose:** Balances technical vision and crypto capabilities.
   - **Training Focus:** Specializes in blockchain data and was trained on a large dataset of crypto whitepapers, research reports, and market data.
   - **Unique Feature:** Fine-tuned using LoRA for optimized crypto-specific performance.
2. **[EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO](https://huggingface.co/EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO)**
   - **Purpose:** Specialized in agentic reasoning and advanced coding tasks.
   - **Unique Feature:** Offers a 128K context window and comes with built-in tools for ReAct, calculator, search, and more.
3. **[aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored](https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored)**
   - **Purpose:** Provides uncensored, creativity-driven responses suited to writing, role-playing, and in-depth conversations.
   - **Unique Feature:** Its uncensored nature allows open exploration of creative writing and darker, more complex roleplay scenarios.
4. **[DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst](https://huggingface.co/DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst)**
   - **Purpose:** Enhances performance with latent diffusion model blending.
   - **Unique Feature:** Builds on Llama-3.1’s foundation and improves unseen-task generalization with latent diffusion.

---
## Model Specifications
### Merge Configuration
```yaml
# Llama3.1-TheiaFire-DarkFusion-8B Merge Configuration
models:
  - model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
    parameters:
      density: 0.4  # Balancing technical vision and crypto capabilities
      weight: 0.3
  - model: EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
    parameters:
      density: 0.6  # Giving priority to code-based reasoning and agentic capabilities
      weight: 0.4
  - model: aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
    parameters:
      density: 0.5  # Focus on creativity and uncensored roleplay flexibility
      weight: 0.2
  - model: DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
    parameters:
      density: 0.5  # Blending latent diffusion capabilities for unseen tasks
      weight: 0.1
merge_method: ties
base_model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
dtype: bfloat16
parameters:
  normalize: true
out_dtype: float16
```
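
To illustrate what `density`, `weight`, and `normalize` control, here is a minimal pure-Python sketch of the three TIES steps (trim, elect sign, disjoint merge) on flat lists of numbers. It is an illustration under simplifying assumptions, not the merge tool's actual implementation: `trim` and `ties_merge` are hypothetical names, the toy values are made up, and a real merge operates on full weight tensors.

```python
from typing import List, Sequence


def trim(delta: Sequence[float], density: float) -> List[float]:
    """Keep the top `density` fraction of entries by magnitude; zero the rest."""
    k = int(round(density * len(delta)))
    if k <= 0:
        return [0.0] * len(delta)
    # Indices of the k largest-magnitude entries.
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(
    base: Sequence[float],
    finetuned: Sequence[Sequence[float]],
    densities: Sequence[float],
    weights: Sequence[float],
    normalize: bool = True,
) -> List[float]:
    """Merge fine-tuned parameter lists into `base` with a simplified TIES procedure."""
    # Step 1: task vectors (fine-tuned minus base), trimmed by density, scaled by weight.
    deltas = []
    for params, density, weight in zip(finetuned, densities, weights):
        delta = [p - b for p, b in zip(params, base)]
        deltas.append([weight * d for d in trim(delta, density)])

    merged = []
    for i, b in enumerate(base):
        column = [d[i] for d in deltas]
        # Step 2: elect a sign per parameter from the summed weighted deltas.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # Step 3: disjoint merge, keeping only deltas that agree with the elected sign.
        agree = [(c, w) for c, w in zip(column, weights) if c * sign > 0]
        value = sum(c for c, _ in agree)
        if normalize and agree:
            # Divide by the total weight of the contributing models.
            value /= sum(w for _, w in agree)
        merged.append(b + value)
    return merged


# Toy example: two "fine-tuned" models over a 4-parameter base.
base = [0.0, 0.0, 0.0, 0.0]
model_a = [1.0, -2.0, 0.1, 0.0]
model_b = [3.0, 2.0, -0.1, 0.0]
merged = ties_merge(base, [model_a, model_b], densities=[0.5, 0.5], weights=[0.5, 0.5])
print(merged)  # [2.0, 2.0, 0.0, 0.0]
```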
---
## Intended Use Cases
1. **Crypto Analysis & Blockchain Projects**
   - Leverages data from CoinMarketCap and research reports for in-depth analysis of blockchain projects and crypto markets.
   - Suited to creating blockchain-related content or automating crypto data analysis.
2. **Advanced Coding Assistant**
   - Supports agentic behavior such as reasoning and acting, which makes it well suited to AI-driven coding assistance.
   - Handles large-scale coding projects with tools like search and calculator integration.
3. **Creative Writing & Roleplay**
   - **Uncensored output** allows for rich, expressive writing suited to novels, creative pieces, or roleplay scenarios.
   - Capable of producing nuanced, emotionally complex character responses in roleplaying games or interactive storytelling.
4. **Unseen Task Generalization**
   - The latent diffusion component is intended to help with unseen tasks by learning weight distributions adaptively, improving performance on novel datasets or tasks.

---
## Performance
- The model has shown significant improvements in **multi-domain reasoning**, **code generation**, and **unconstrained creative output**.
- **Enhanced task generalization** due to latent diffusion model blending techniques.

---
## Model Capabilities
- **Context Window**: 128K tokens (enough for long-form tasks like novel writing and in-depth research).
- **Agentic Tools**: Built-in tools like search and calculator.
- **Safety**: While uncensored, responsible prompting is encouraged to ensure the best user experience and ethical usage.
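
As a rough illustration of budgeting against the context window, the sketch below checks whether a prompt plus the requested completion fits. It assumes "128K" means 128 × 1024 = 131,072 tokens and that token counts come from the model's tokenizer. `max_prompt_tokens` and `fits_in_context` are hypothetical helpers, not library functions.

```python
# Assumption: "128K" is read as 128 * 1024 tokens.
CONTEXT_WINDOW = 128 * 1024


def max_prompt_tokens(max_new_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Largest prompt length that still leaves room for `max_new_tokens`."""
    if max_new_tokens < 0:
        raise ValueError("max_new_tokens must be non-negative")
    return max(context_window - max_new_tokens, 0)


def fits_in_context(
    prompt_tokens: int,
    max_new_tokens: int,
    context_window: int = CONTEXT_WINDOW,
) -> bool:
    """True if the prompt plus the requested completion fits in the window."""
    return prompt_tokens <= max_prompt_tokens(max_new_tokens, context_window)


# A 100,000-token prompt leaves room for a 256-token reply; a 131,000-token prompt does not.
print(fits_in_context(100_000, 256))
print(fits_in_context(131_000, 256))
```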
---
## Usage
This model can be used with popular AI libraries such as **Transformers** and **LangChain**. Below is a basic setup using **Transformers**:
### Example Code
```python
import transformers
import torch

# Full Hugging Face repo id of this model
model_id = "ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B"

# Load the weights in bfloat16 and place them automatically on the available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an AI assistant skilled in coding and creative writing."},
    {"role": "user", "content": "Please write me a Python function to compute the factorial of a number."},
]

# Generate up to 256 new tokens and print the last message of the conversation
outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])
```
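
With chat-style input, the `text-generation` pipeline returns the whole conversation as a list of role/content messages, so printing the last entry shows a dict, not plain text. The hypothetical helper below pulls out just the reply text. It assumes that output shape and is demonstrated on a hand-written stand-in for `outputs`, so no model download is needed to try it.

```python
from typing import Any, Dict, List


def extract_reply(outputs: List[Dict[str, Any]]) -> str:
    """Return the assistant's reply text from chat-style pipeline output.

    Assumes outputs[0]["generated_text"] is a list of role/content dicts
    whose last entry is the newly generated assistant message.
    """
    last_message = outputs[0]["generated_text"][-1]
    if last_message.get("role") != "assistant":
        raise ValueError("last message is not an assistant turn")
    return last_message["content"]


# Hand-written stand-in with the assumed shape of the pipeline output.
fake_outputs = [
    {
        "generated_text": [
            {"role": "user", "content": "Say hello."},
            {"role": "assistant", "content": "Hello!"},
        ]
    }
]
print(extract_reply(fake_outputs))  # Hello!
```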
---
## Limitations
- **Uncensored Output**: While this model offers creative freedom, it may produce content that could be considered inappropriate or unsuitable for certain contexts.
- **Bias**: As with all language models, this one may reflect inherent biases in the training data. Users are encouraged to review and edit the outputs before use.

---
## Acknowledgments
This model is a collective effort, combining the work of:

- **Chainbase Labs** (for Theia-Llama)
- **EpistemeAI** (for Fireball Meta-Llama)
- **aifeifei798** (for DarkIdol)
- **DeepAutoAI** (for LDM Soup)

Special thanks to the open-source community and the developers who contributed to the training and fine-tuning of these models.

---