---
license: apache-2.0
tags:
- merge
- mergekit
- della-linear
- Hermes3
- SuperNova
- Purosani
- Llama3.1
- Casual-Autopsy/L3-bluuwhale-SAO-MIX-8B
- instruction-following
- long-form-generation
- storytelling
base_model:
- ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B
---

# **L3SAO-Mix-SuperHermes-NovaPurosani-8B**

**L3SAO-Mix-SuperHermes-NovaPurosani-8B** is an innovative merged model that combines high-performance elements from two prominent models to create a powerhouse capable of excelling in a wide range of tasks. Whether it's for **instruction-following**, **roleplaying**, or **complex storytelling**, this model is designed for adaptability and precision.

## 🌟 **Family Tree**

This model is a **hybrid** of the following:

- [**ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B**](https://huggingface.co/ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B)
- [**Casual-Autopsy/L3-bluuwhale-SAO-MIX-8B-V1_fp32-merge-calc**](https://huggingface.co/Casual-Autopsy/L3-bluuwhale-SAO-MIX-8B-V1_fp32-merge-calc)

These models are themselves built upon a solid foundation of advanced AI architectures, ensuring a model that’s both **robust** and **versatile** for multiple applications.

## 🌳 **Model Family Genealogy**

This model represents the fusion of **Hermes3**'s instruction-following prowess and **bluuwhale**'s rich contextual understanding, making it perfect for tasks that require **long-form generation** and **complex contextual analysis**.

---

## 🧬 **Detailed Model Lineage**

### **A: ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B**

This model is built from:

- **NousResearch/Hermes-3-Llama-3.1-8B**: Known for its strong instruction-following capabilities and contextual understanding.
- **THUDM/LongWriter-llama3.1-8B**: Focused on **long-form content generation**, capable of handling over 10,000 words in a single pass, making it perfect for detailed content creation.

### **B: Casual-Autopsy/L3-bluuwhale-SAO-MIX-8B-V1**

This model incorporates components from:

- **Sao10K/L3-8B-Stheno-v3.2**
- **Sao10K/L3-8B-Tamamo-v1**
- **Sao10K/L3-8B-Lunaris-v1**

Its primary strengths lie in **instructional roleplaying** and **creative content generation**.

---

## 🛠️ **Merge Details**

This model was merged using the **Della Linear** (`della_linear`) method with **bfloat16** precision. The process involved merging key elements from both parent models to balance **instruction-following** with **creative contextual analysis**.

The following YAML configuration was used during the merge:

```yaml
merge_method: della_linear
# ... (a few configuration lines are elided in the source diff)
parameters:
  int8_mask: true
  normalize: true

base_model: ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B
models:
  - model: ZeroXClem/Llama3.1-Hermes3-SuperNova-8B-L3.1-Purosani-2-8B
    parameters:
      # ... (a few lines are elided in the source diff)
      weight: 1
      density: 0.55
```
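
To reproduce a merge like this locally, here is a minimal sketch that drives the mergekit CLI from Python. It assumes mergekit is installed, that the configuration above is saved as `config.yaml`, and that the output directory name is arbitrary; none of these details come from this card.

```python
# Reproduction sketch (assumptions: mergekit installed via `pip install mergekit`,
# the YAML above saved as config.yaml, output path chosen freely).
import subprocess

subprocess.run(
    [
        "mergekit-yaml",                             # mergekit's config-driven merge command
        "config.yaml",                               # the della_linear configuration shown above
        "./L3SAO-Mix-SuperHermes-NovaPurosani-8B",   # output directory (arbitrary name)
        "--cuda",                                    # optional: run the merge on GPU
    ],
    check=True,
)
```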

---

## 🎯 **Extended Roleplay & Storytelling Features**

With its heritage from **SuperNova** and **bluuwhale**, this model excels in **immersive storytelling** and **dynamic roleplay scenarios**. It can handle:

- **Long-form character development**: Crafting rich, nuanced personalities for interactive narratives.
- **World-building & lore**: Generating detailed worlds and interconnected lore on the fly.
- **Dynamic dialogues**: Perfect for game development, this model can generate complex, believable conversations for NPCs in real time.

---

## 🚀 **Key Features & Capabilities**

### **1. Long-Form Content Generation**

This model is ideal for generating large bodies of text without losing coherence, making it perfect for the following (see the usage sketch after this list):

- **Research papers**
- **Novels**
- **Detailed reports**
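
As a usage sketch (not part of the original card), the model can be loaded with 🤗 Transformers for long-form generation. The repository id below is an assumption; substitute the actual Hugging Face repo name.

```python
# Long-form generation sketch. The repo id is assumed, not confirmed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZeroXClem/L3SAO-Mix-SuperHermes-NovaPurosani-8B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bfloat16 merge precision
    device_map="auto",
)

prompt = "Write a detailed, multi-chapter outline for a science-fiction novella."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Increase max_new_tokens for longer drafts; sampling keeps the prose varied.
output = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```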

### **2. Advanced Instruction-Following**

Thanks to its **Hermes3** roots, this model can effectively follow complex instructions for the following (see the chat-template sketch after this list):

- **Task automation**
- **AI assistants**
- **Research and summarization tasks**
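
A hedged instruction-following sketch, reusing `tokenizer` and `model` from the previous example and assuming the merge inherits a Llama-3.1-style chat template from its parents:

```python
# Instruction-following via the tokenizer's chat template.
# Reuses `tokenizer` and `model` from the sketch above; the chat template is
# assumed to be inherited from the Llama 3.1 / Hermes 3 parents.
messages = [
    {"role": "system", "content": "You are a concise research assistant."},
    {"role": "user", "content": "Summarize the main trade-offs of linear model merging in three bullet points."},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```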

### **3. Roleplay and Storytelling**

The model’s ability to handle both short and long interactions makes it perfect for:

- **Roleplaying games**
- **Interactive storytelling**
- **Narrative creation**

---

## 📜 **License**

This model is available under the **Apache-2.0 License**, allowing users to utilize and modify it freely with attribution.

## 💡 **Tags**

- `merge`
- `mergekit`
- `Hermes3`
- `SuperNova`
- `Purosani`
- `Llama3.1`
- `instruction-following`
- `long-form-generation`
- `storytelling`

---