ZeroXClem committed 034c330 (parent: 1b7b089): Update README.md

Files changed (1): README.md (+157, -12)

README.md CHANGED
**Previous version:**

license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- Chainbase-Labs/Theia-Llama-3.1-8B-v1
- EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
- aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
- DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
---

# ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B

ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
* [Chainbase-Labs/Theia-Llama-3.1-8B-v1](https://huggingface.co/Chainbase-Labs/Theia-Llama-3.1-8B-v1)
* [EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO](https://huggingface.co/EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO)
* [aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored](https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored)
* [DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst](https://huggingface.co/DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst)

## 🧩 Configuration

```yaml
models:
  - model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
    parameters:
      # ... (per-model parameters and the remaining model entries are not shown in this diff)
      weight: 0.1

merge_method: ties
base_model: DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst # Using the ldm_soup as a base for optimal diffusion of strengths
dtype: bfloat16
parameters:
  normalize: true # Ensuring consistent scaling of layer outputs
  out_dtype: float16 # Maintaining efficiency and precision
```
 
**Updated version:**

license: apache-2.0
tags:
- merge
- model_fusion
- TIES
- Llama3.1
- crypto
- blockchain
- coding_assistant
- creative_writing
- roleplaying
- uncensored
- latent_diffusion
- long_context
- agentic_AI
- multi_domain
- research
- instruction-following
- technical_reasoning
- task_generalization
- AI_tools
- GPT
base_model:
- Chainbase-Labs/Theia-Llama-3.1-8B-v1
- EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
- aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
- DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
datasets:
- CoinMarketCap
- blockchain_projects
- agentic_code_DPO
library_name: transformers
---

# ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B

**Architecture:** Llama 3.1 (8B)
**Proposed Name:** Llama3.1-TheiaFire-DarkFusion-8B
**Merge Method:** TIES
**Merge Date:** 10/25/2024
**License:** Apache 2.0

---

## Model Overview

**Llama3.1-TheiaFire-DarkFusion-8B** is a specialized fusion of four models, combined to balance technical reasoning, creativity, and uncensored freedom across a variety of use cases. Whether you need advanced coding assistance, blockchain insights, creative roleplaying, or general-purpose capabilities, this model aims to deliver strong results.

The models were merged with the **TIES** method, which trims low-magnitude parameter changes and resolves sign conflicts between the source models before combining them, so that each model's strengths are preserved rather than averaged away.
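
For intuition, the sketch below illustrates the TIES idea on toy tensors (trim each model's parameter deltas, elect a per-parameter sign, then average only the deltas that agree with it). It is a simplified illustration of the technique, not the mergekit implementation, and the random vectors stand in for real model weights.

```python
# Toy TIES merge on flat parameter vectors: trim -> elect sign -> disjoint mean.
# Illustration only; real merges operate per-tensor over full model checkpoints.
import torch

def ties_merge(base: torch.Tensor, finetuned: list[torch.Tensor], density: float = 0.5) -> torch.Tensor:
    deltas = [ft - base for ft in finetuned]                      # task vectors
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.numel()))                      # keep the top-`density` fraction by magnitude
        threshold = d.abs().flatten().kthvalue(d.numel() - k + 1).values
        trimmed.append(torch.where(d.abs() >= threshold, d, torch.zeros_like(d)))
    stacked = torch.stack(trimmed)
    sign = torch.sign(stacked.sum(dim=0))                         # elect a sign per parameter
    agree = (torch.sign(stacked) == sign) & (stacked != 0)        # keep deltas that agree with the elected sign
    merged_delta = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged_delta

# Example: three "fine-tuned" vectors derived from a shared base.
base = torch.zeros(10)
finetuned = [base + 0.1 * torch.randn(10) for _ in range(3)]
print(ties_merge(base, finetuned))
```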

---

## Model Components

The following models were merged to create **Llama3.1-TheiaFire-DarkFusion-8B**:

1. **[Theia-Llama-3.1-8B-v1](https://huggingface.co/Chainbase-Labs/Theia-Llama-3.1-8B-v1)**
   - **Purpose:** Contributes technical vision and crypto-domain capability.
   - **Training Focus:** Specializes in blockchain data; trained on a large corpus of crypto whitepapers, research reports, and market data.
   - **Unique Feature:** Fine-tuned with LoRA for optimized crypto-specific performance.

2. **[EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO](https://huggingface.co/EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO)**
   - **Purpose:** Specializes in agentic reasoning and advanced coding tasks.
   - **Unique Feature:** Ships with a 128K context window and built-in tools for ReAct-style reasoning, calculation, search, and more.

3. **[aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored](https://huggingface.co/aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored)**
   - **Purpose:** Provides uncensored, creativity-driven responses, ideal for writing, role-playing, and in-depth conversations.
   - **Unique Feature:** Its uncensored nature allows open exploration of creative writing and darker, more complex roleplay scenarios.

4. **[DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst](https://huggingface.co/DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst)**
   - **Purpose:** Enhances performance through latent diffusion model blending.
   - **Unique Feature:** Builds on Llama-3.1's foundation and improves generalization to unseen tasks via latent diffusion.

---

## Model Specifications

### Merge Configuration

```yaml
# Llama3.1-TheiaFire-DarkFusion-8B Merge Configuration
models:
  - model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
    parameters:
      # ... (the remaining per-model parameters and the other three model entries are not shown in this diff)
      weight: 0.1

merge_method: ties
base_model: Theia-Llama-3.1-8B-v1
dtype: bfloat16
parameters:
  normalize: true
  out_dtype: float16
```
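
To reproduce a merge from a config like this, mergekit is the expected tool; the usual route is the `mergekit-yaml config.yaml ./output-dir` CLI. Below is a rough sketch using mergekit's Python entry points (`MergeConfiguration` / `run_merge`) as documented upstream; the file path and option values are placeholders, and the exact API should be checked against the mergekit version you install.

```python
# Sketch: driving a mergekit merge from Python. Assumes mergekit is installed;
# the mergekit-yaml CLI is the simpler and more common entry point.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "theiafire-darkfusion.yaml"            # the YAML shown above, saved locally (placeholder path)
OUTPUT_PATH = "./Llama3.1-TheiaFire-DarkFusion-8B"  # where the merged checkpoint is written

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path=OUTPUT_PATH,
    options=MergeOptions(
        cuda=torch.cuda.is_available(),   # run the merge on GPU when available
        copy_tokenizer=True,              # copy the base model's tokenizer into the output
    ),
)
```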

---

## Intended Use Cases

1. **Crypto Analysis & Blockchain Projects**
   - Leverages data from CoinMarketCap and research reports for in-depth analysis of blockchain projects and crypto markets.
   - Ideal for creating blockchain-related content or automating crypto data analysis.

2. **Advanced Coding Assistant**
   - Built-in support for agentic behavior such as reasoning and acting, making it well suited to AI-driven coding assistance.
   - Handles large-scale coding projects with tools like search and calculator integration.

3. **Creative Writing & Roleplay**
   - **Uncensored output** allows rich, expressive writing, ideal for novels, creative pieces, or roleplay scenarios.
   - Capable of producing nuanced, emotionally complex character responses in roleplaying games or interactive storytelling.

4. **Unseen Task Generalization**
   - Latent diffusion capabilities let the model adapt its weight distributions to new tasks, improving performance on novel datasets.

---

## Performance

- Improvements in **multi-domain reasoning**, **code generation**, and **unconstrained creative output** (qualitative observations; no benchmark scores are reported in this card).
- **Enhanced task generalization**, attributed to the latent diffusion blending contributed by ldm_soup.

---

## Model Capabilities

- **Context Window**: 128K tokens, capable of handling long-form tasks such as novel writing and in-depth research (a quick way to confirm this from the model config is shown below).
- **Agentic Tools**: Built-in tools such as search and calculator.
- **Safety**: While uncensored, responsible prompting is encouraged to ensure the best user experience and ethical usage.
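
A quick way to sanity-check the advertised context window from the published config, without downloading the weights (this assumes the merged repo exposes a standard Llama-style `config.json`):

```python
from transformers import AutoConfig

# Loads only config.json from the Hub; no model weights are downloaded.
config = AutoConfig.from_pretrained("ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B")
print(config.max_position_embeddings)  # 131072 corresponds to the advertised 128K-token window
```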

---

## Usage

This model can be used with popular libraries such as **Transformers** and **LangChain**. Below is a basic setup using **Transformers**, followed by a LangChain variant:

### Example Code

```python
import transformers
import torch

# Full Hugging Face repo id of the merged model
model_id = "ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an AI assistant skilled in coding and creative writing."},
    {"role": "user", "content": "Please write me a Python function to compute the factorial of a number."},
]

outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])  # last message in the returned conversation (the assistant's reply)
```
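
Since the card also mentions LangChain, here is a minimal sketch using the langchain-huggingface integration; the package name, the `HuggingFacePipeline.from_model_id` signature, and the generation settings are assumptions about that integration rather than something this card specifies.

```python
# Minimal sketch: exposing the merged model to LangChain through the
# langchain-huggingface integration (assumes: pip install langchain-huggingface).
from langchain_huggingface import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 256},
)

# The wrapped pipeline behaves like any other LangChain LLM.
print(llm.invoke("Write a Python function that computes the factorial of a number."))
```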

---

## Limitations

- **Uncensored Output**: While this model offers creative freedom, it may produce content that is inappropriate or unsuitable for certain contexts.
- **Bias**: As with all language models, it may reflect biases inherent in its training data. Users are encouraged to review and edit outputs before use.

---

## Acknowledgments

This model is a collective effort, combining the groundbreaking work of:

- **Chainbase Labs** (Theia-Llama)
- **EpistemeAI** (Fireball Meta-Llama)
- **aifeifei798** (DarkIdol)
- **DeepAutoAI** (ldm_soup)

Special thanks to the open-source community and the developers who contributed to the training and fine-tuning of these models.

---