---
license: apache-2.0
tags:
- merge
- model_fusion
- TIES
- Llama3.1
- crypto
- blockchain
- coding_assistant
- creative_writing
- roleplaying
- uncensored
- latent_diffusion
- long_context
- agentic_AI
- multi_domain
- research
- instruction-following
- technical_reasoning
- task_generalization
- AI_tools
- GPT
base_model:
- Chainbase-Labs/Theia-Llama-3.1-8B-v1
- EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
- aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
- DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
datasets:
- CoinMarketCap
- blockchain_projects
- agentic_code_DPO
library_name: transformers
---
# ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B
**Architecture:** Llama 3.1 - 8B
**Proposed Name:** Llama3.1-TheiaFire-DarkFusion-8B
**Merge Method:** TIES
**Merge Date:** October 25, 2024
**License:** Apache 2.0
---
## Model Overview
The **Llama3.1-TheiaFire-DarkFusion-8B** is a highly specialized fusion of four cutting-edge models, meticulously combined to provide an exceptional balance of technical reasoning, creativity, and uncensored freedom for a variety of use cases. Whether you need advanced coding assistance, blockchain insights, creative roleplaying, or general-purpose AI capabilities, this model delivers state-of-the-art results.
This model was merged using the **TIES** merge method to ensure optimal blending of layer weights and parameter configurations, resulting in a model that excels in multiple domains.
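For readers unfamiliar with TIES (Trim, Elect Sign & Merge), the sketch below illustrates the core idea on a single parameter tensor. It is a simplified, hypothetical reimplementation for intuition only, not the mergekit code used for this merge; the `density` and `weight` values mirror the configuration listed later in this card.

```python
# Illustrative TIES sketch: trim small task-vector entries, elect a majority
# sign per weight, and average only the agreeing deltas. Not the mergekit code.
import torch

def ties_merge(base, finetuned, density=0.5, weights=None):
    """Toy TIES merge of several fine-tuned variants of one parameter tensor."""
    weights = weights or [1.0] * len(finetuned)
    deltas = []
    for w, ft in zip(weights, finetuned):
        delta = ft - base                              # "task vector" for this model
        k = max(1, int(density * delta.numel()))       # trim: keep top-k entries by magnitude
        thresh = delta.abs().flatten().kthvalue(delta.numel() - k + 1).values
        deltas.append(w * torch.where(delta.abs() >= thresh, delta, torch.zeros_like(delta)))
    stacked = torch.stack(deltas)
    elected = torch.sign(stacked.sum(dim=0))           # elect the dominant sign per weight
    agree = torch.sign(stacked) == elected             # disjoint merge: drop disagreeing entries
    merged = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged

# Toy usage on a random 4x4 "weight matrix" standing in for a model parameter.
base = torch.randn(4, 4)
tuned_a, tuned_b = base + 0.1 * torch.randn(4, 4), base + 0.1 * torch.randn(4, 4)
print(ties_merge(base, [tuned_a, tuned_b], density=0.5, weights=[0.6, 0.4]))
```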
---
For best results in LMStudio, leave the system prompt blank; the tokenizer appears to struggle when a system prompt is supplied.
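If system prompts also cause issues when calling the model directly with Transformers, one hedged workaround (a sketch only; the repo id assumes the published name of this card) is to build the chat prompt without a system turn via the tokenizer's chat template:

```python
# Sketch: format a prompt with no system message. Assumes the tokenizer in
# this repository ships a chat template (standard for Llama 3.1 derivatives).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B")
messages = [
    {"role": "user", "content": "Summarize today's major Layer-2 blockchain trends."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # feed this string to your text-generation backend of choice
```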
## Model Components
The following models were merged to create **Llama3.1-TheiaFire-DarkFusion-8B**:
1. **[Theia-Llama-3.1-8B-v1](https://huggingface.co./Chainbase-Labs/Theia-Llama-3.1-8B-v1)**
- **Purpose:** Balances technical vision and crypto capabilities.
- **Training Focus:** This model specializes in blockchain data and was trained on a large dataset of crypto whitepapers, research reports, and market data.
- **Unique Feature:** Fine-tuned using LoRA for optimized crypto-specific performance.
2. **[EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO](https://huggingface.co./EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO)**
- **Purpose:** Specialized in agentic reasoning and advanced coding tasks.
- **Unique Feature:** This model is equipped with a 128K context window and comes with built-in tools for ReAct, calculator, search, and more.
3. **[aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored](https://huggingface.co./aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored)**
- **Purpose:** Provides uncensored, creativity-driven responses ideal for writing, role-playing, and in-depth conversations.
- **Unique Feature:** Uncensored nature allows for open exploration of creative writing and darker, more complex roleplay scenarios.
4. **[DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst](https://huggingface.co./DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst)**
- **Purpose:** Enhances performance with latent diffusion model blending.
- **Unique Feature:** This model builds upon Llama-3.1’s foundation and improves unseen task generalization with latent diffusion.
---
## Model Specifications
### Merge Configuration
```yaml
# Llama3.1-TheiaFire-DarkFusion-8B Merge Configuration
models:
  - model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
    parameters:
      density: 0.4  # Balancing technical vision and crypto capabilities
      weight: 0.3
  - model: EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
    parameters:
      density: 0.6  # Giving priority to code-based reasoning and agentic capabilities
      weight: 0.4
  - model: aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
    parameters:
      density: 0.5  # Focus on creativity and uncensored roleplay flexibility
      weight: 0.2
  - model: DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
    parameters:
      density: 0.5  # Blending latent diffusion capabilities for unseen tasks
      weight: 0.1
merge_method: ties
base_model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
dtype: bfloat16
parameters:
  normalize: true
out_dtype: float16
```
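To reproduce a merge like this, the configuration above can be saved as `config.yaml` and run with [mergekit](https://github.com/arcee-ai/mergekit). The snippet below is a minimal sketch based on mergekit's documented Python entry points (`MergeConfiguration`, `run_merge`); option names may vary between mergekit versions, and the output path is illustrative.

```python
# Sketch only: `pip install mergekit`, then save the YAML above as config.yaml.
# API follows mergekit's published example; option names may differ by version.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./Llama3.1-TheiaFire-DarkFusion-8B",  # illustrative output directory
    options=MergeOptions(
        cuda=False,           # set True to merge on GPU
        copy_tokenizer=True,  # copy tokenizer files into the merged output
        lazy_unpickle=True,   # lower peak memory while reading shards
    ),
)
```

Equivalently, the `mergekit-yaml config.yaml ./output-dir` CLI performs the same merge.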
---
## Intended Use Cases
1. **Crypto Analysis & Blockchain Projects**
- Leverages data from CoinMarketCap and research reports for in-depth analysis of blockchain projects and crypto markets.
- Ideal for creating blockchain-related content or automating crypto data analysis.
2. **Advanced Coding Assistant**
- Built-in support for agentic behavior such as ReAct-style reasoning and action, making it well suited for AI-driven coding assistance.
- Handles large-scale coding projects with tools like search and calculator integration.
3. **Creative Writing & Roleplay**
- **Uncensored output** allows for rich, expressive writing ideal for novels, creative pieces, or roleplay scenarios.
- Capable of producing nuanced, emotionally complex character responses in roleplaying games or interactive storytelling.
4. **Unseen Task Generalization**
- The latent-diffusion component adaptively learns weight distributions, improving generalization to unseen tasks and novel datasets.
---
## Performance
- The model has shown significant improvements in **multi-domain reasoning**, **code generation**, and **unconstrained creative output**.
- **Enhanced task generalization** due to latent diffusion model blending techniques.
---
## Model Capabilities
- **Context Window**: 128K (capable of handling long-form tasks like novel writing and in-depth research).
- **Agentic Tools**: Built-in tools like search and calculator.
- **Safety**: While uncensored, responsible prompting is encouraged to ensure the best user experience and ethical usage.
---
## Usage
This model can be used with popular AI libraries such as **Transformers** and **LangChain**. Below is a basic setup using **Transformers**:
### Example Code
```python
import transformers
import torch

model_id = "ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an AI assistant skilled in coding and creative writing."},
    {"role": "user", "content": "Please write me a Python function to compute the factorial of a number."},
]

outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])  # last entry is the assistant's reply
```
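For LangChain, a minimal sketch using the `langchain-huggingface` integration might look like the following (the repo id assumes the published name of this card, and the generation parameters are illustrative):

```python
# Sketch: wrap a local Transformers text-generation pipeline for LangChain.
# Requires `pip install langchain-huggingface transformers torch`.
from langchain_huggingface import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 256},
)

print(llm.invoke("Explain, in two sentences, how TIES merging combines task vectors."))
```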
---
## Limitations
- **Uncensored Output**: While this model offers creative freedom, it may produce content that could be considered inappropriate or unsuitable for certain contexts.
- **Bias**: As with all language models, this one may reflect inherent biases in the training data. Users are encouraged to review and edit the outputs before use.
---
## Acknowledgments
This model is a collective effort, combining the groundbreaking work from:
- **Chainbase Labs** (for Theia-Llama)
- **EpistemeAI** (for Fireball Meta-Llama)
- **Aifeifei798** (for DarkIdol)
- **DeepAutoAI** (for LDM Soup)
Special thanks to the open-source community and the developers who contributed to the training and fine-tuning of these models.
---