---
license: apache-2.0
tags:
- merge
- model_fusion
- TIES
- Llama3.1
- crypto
- blockchain
- coding_assistant
- creative_writing
- roleplaying
- uncensored
- latent_diffusion
- long_context
- agentic_AI
- multi_domain
- research
- instruction-following
- technical_reasoning
- task_generalization
- AI_tools
- GPT
base_model:
- Chainbase-Labs/Theia-Llama-3.1-8B-v1
- EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
- aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
- DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
datasets:
- CoinMarketCap
- blockchain_projects
- agentic_code_DPO
library_name: transformers
---
# ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B
**Architecture:** Llama 3.1 - 8B
**Proposed Name:** Llama3.1-TheiaFire-DarkFusion-8B
**Merge Method:** TIES
**Merge Date:** October 25, 2024
**License:** Apache 2.0
---
## Model Overview
The **Llama3.1-TheiaFire-DarkFusion-8B** is a highly specialized fusion of four cutting-edge models, meticulously combined to provide an exceptional balance of technical reasoning, creativity, and uncensored freedom for a variety of use cases. Whether you need advanced coding assistance, blockchain insights, creative roleplaying, or general-purpose AI capabilities, this model delivers state-of-the-art results.
This model was merged using the **TIES** merge method to ensure optimal blending of layer weights and parameter configurations, resulting in a model that excels in multiple domains.
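For readers unfamiliar with TIES (Trim, Elect Sign & Merge), the sketch below illustrates the core idea on a single parameter tensor. It is a simplified, hypothetical reimplementation for intuition only, not the mergekit code used for this merge; the `density` and `weight` values mirror the configuration listed later in this card.

```python
# Illustrative TIES sketch: trim small task-vector entries, elect a majority
# sign per weight, and average only the agreeing deltas. Not the mergekit code.
import torch

def ties_merge(base, finetuned, density=0.5, weights=None):
    """Toy TIES merge of several fine-tuned variants of one parameter tensor."""
    weights = weights or [1.0] * len(finetuned)
    deltas = []
    for w, ft in zip(weights, finetuned):
        delta = ft - base                              # "task vector" for this model
        k = max(1, int(density * delta.numel()))       # trim: keep top-k entries by magnitude
        thresh = delta.abs().flatten().kthvalue(delta.numel() - k + 1).values
        deltas.append(w * torch.where(delta.abs() >= thresh, delta, torch.zeros_like(delta)))
    stacked = torch.stack(deltas)
    elected = torch.sign(stacked.sum(dim=0))           # elect the dominant sign per weight
    agree = torch.sign(stacked) == elected             # disjoint merge: drop disagreeing entries
    merged = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged

# Toy usage on a random 4x4 "weight matrix" standing in for a model parameter.
base = torch.randn(4, 4)
tuned_a, tuned_b = base + 0.1 * torch.randn(4, 4), base + 0.1 * torch.randn(4, 4)
print(ties_merge(base, [tuned_a, tuned_b], density=0.5, weights=[0.6, 0.4]))
```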
---
For best results in LMStudio, leave the system prompt blank; the tokenizer appears to struggle when a system prompt is supplied.
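If system prompts also cause issues when calling the model directly with Transformers, one hedged workaround (a sketch only; the repo id assumes the published name of this card) is to build the chat prompt without a system turn via the tokenizer's chat template:

```python
# Sketch: format a prompt with no system message. Assumes the tokenizer in
# this repository ships a chat template (standard for Llama 3.1 derivatives).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B")
messages = [
    {"role": "user", "content": "Summarize today's major Layer-2 blockchain trends."},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # feed this string to your text-generation backend of choice
```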
## Model Components
The following models were merged to create **Llama3.1-TheiaFire-DarkFusion-8B**:
1. **[Theia-Llama-3.1-8B-v1](https://huggingface.co./Chainbase-Labs/Theia-Llama-3.1-8B-v1)**
- **Purpose:** Balances technical vision and crypto capabilities.
- **Training Focus:** This model specializes in blockchain data and was trained on a large dataset of crypto whitepapers, research reports, and market data.
- **Unique Feature:** Fine-tuned using LoRA for optimized crypto-specific performance.
2. **[EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO](https://huggingface.co./EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO)**
- **Purpose:** Specialized in agentic reasoning and advanced coding tasks.
- **Unique Feature:** This model is equipped with a 128K context window and comes with built-in tools for ReAct, calculator, search, and more.
3. **[aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored](https://huggingface.co./aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored)**
- **Purpose:** Provides uncensored, creativity-driven responses ideal for writing, role-playing, and in-depth conversations.
- **Unique Feature:** Uncensored nature allows for open exploration of creative writing and darker, more complex roleplay scenarios.
4. **[DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst](https://huggingface.co./DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst)**
- **Purpose:** Enhances performance with latent diffusion model blending.
- **Unique Feature:** This model builds upon Llama-3.1’s foundation and improves unseen task generalization with latent diffusion.
---
## Model Specifications
### Merge Configuration
```yaml
# Llama3.1-TheiaFire-DarkFusion-8B Merge Configuration
models:
  - model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
    parameters:
      density: 0.4  # Balancing technical vision and crypto capabilities
      weight: 0.3
  - model: EpistemeAI/Fireball-Meta-Llama-3.2-8B-Instruct-agent-003-128k-code-DPO
    parameters:
      density: 0.6  # Giving priority to code-based reasoning and agentic capabilities
      weight: 0.4
  - model: aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored
    parameters:
      density: 0.5  # Focus on creativity and uncensored roleplay flexibility
      weight: 0.2
  - model: DeepAutoAI/ldm_soup_Llama-3.1-8B-Inst
    parameters:
      density: 0.5  # Blending latent diffusion capabilities for unseen tasks
      weight: 0.1
merge_method: ties
base_model: Chainbase-Labs/Theia-Llama-3.1-8B-v1
dtype: bfloat16
parameters:
  normalize: true
out_dtype: float16
```
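To reproduce a merge like this, the configuration above can be saved as `config.yaml` and run with [mergekit](https://github.com/arcee-ai/mergekit). The snippet below is a minimal sketch based on mergekit's documented Python entry points (`MergeConfiguration`, `run_merge`); option names may vary between mergekit versions, and the output path is illustrative.

```python
# Sketch only: `pip install mergekit`, then save the YAML above as config.yaml.
# API follows mergekit's published example; option names may differ by version.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./Llama3.1-TheiaFire-DarkFusion-8B",  # illustrative output directory
    options=MergeOptions(
        cuda=False,           # set True to merge on GPU
        copy_tokenizer=True,  # copy tokenizer files into the merged output
        lazy_unpickle=True,   # lower peak memory while reading shards
    ),
)
```

Equivalently, the `mergekit-yaml config.yaml ./output-dir` CLI performs the same merge.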
---
## Intended Use Cases
1. **Crypto Analysis & Blockchain Projects**
- Leverages data from CoinMarketCap and research reports for in-depth analysis of blockchain projects and crypto markets.
- Ideal for creating blockchain-related content or automating crypto data analysis.
2. **Advanced Coding Assistant**
- Built-in support for agentic behavior such as ReAct-style reasoning and action, making it well suited for AI-driven coding assistance.
- Handles large-scale coding projects with tools like search and calculator integration.
3. **Creative Writing & Roleplay**
- **Uncensored output** allows for rich, expressive writing ideal for novels, creative pieces, or roleplay scenarios.
- Capable of producing nuanced, emotionally complex character responses in roleplaying games or interactive storytelling.
4. **Unseen Task Generalization**
- The latent-diffusion component adaptively learns weight distributions, improving generalization to unseen tasks and novel datasets.
---
## Performance
- The model has shown significant improvements in **multi-domain reasoning**, **code generation**, and **unconstrained creative output**.
- **Enhanced task generalization** due to latent diffusion model blending techniques.
---
## Model Capabilities
- **Context Window**: 128K (capable of handling long-form tasks like novel writing and in-depth research).
- **Agentic Tools**: Built-in tools like search and calculator.
- **Safety**: While uncensored, responsible prompting is encouraged to ensure the best user experience and ethical usage.
---
## Usage
This model can be used with popular AI libraries such as **Transformers** and **LangChain**. Below is a basic setup using **Transformers**:
### Example Code
```python
import transformers
import torch

model_id = "ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an AI assistant skilled in coding and creative writing."},
    {"role": "user", "content": "Please write me a Python function to compute the factorial of a number."},
]

outputs = pipeline(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])  # last entry is the assistant's reply
```
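For LangChain, a minimal sketch using the `langchain-huggingface` integration might look like the following (the repo id assumes the published name of this card, and the generation parameters are illustrative):

```python
# Sketch: wrap a local Transformers text-generation pipeline for LangChain.
# Requires `pip install langchain-huggingface transformers torch`.
from langchain_huggingface import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="ZeroXClem/Llama3.1-TheiaFire-DarkFusion-8B",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 256},
)

print(llm.invoke("Explain, in two sentences, how TIES merging combines task vectors."))
```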
---
## Limitations
- **Uncensored Output**: While this model offers creative freedom, it may produce content that could be considered inappropriate or unsuitable for certain contexts.
- **Bias**: As with all language models, this one may reflect inherent biases in the training data. Users are encouraged to review and edit the outputs before use.
---
## Acknowledgments
This model is a collective effort, combining the groundbreaking work from:
- **Chainbase Labs** (for Theia-Llama)
- **EpistemeAI** (for Fireball Meta-Llama)
- **Aifeifei798** (for DarkIdol)
- **DeepAutoAI** (for LDM Soup)
Special thanks to the open-source community and the developers who contributed to the training and fine-tuning of these models.
---