|
--- |
|
base_model: |
|
- bunnycore/Llama-3.2-3B-Creative |
|
- bunnycore/Llama-3.2-3B-Mix |
|
- bunnycore/Llama-3.2-3B-Pure-RP |
|
- bunnycore/Llama-3.2-3B-Stock |
|
- bunnycore/Llama-3.2-3B-TitanFusion-v2 |
|
- CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct |
|
- Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated |
|
- Hastagaras/L3.2-JametMini-3B-MK.III |
|
- huihui-ai/Llama-3.2-3B-Instruct-abliterated |
|
- Lyte/Llama-3.2-3B-Overthinker |
|
- passing2961/Thanos-3B |
|
- SaisExperiments/Evil-Alpaca-3B-L3.2 |
|
- ValiantLabs/Llama3.2-3B-Enigma |
|
- ValiantLabs/Llama3.2-3B-ShiningValiant2 |
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- merge |
|
- bfloat16 |
|
- safetensors |
|
- llama |
|
- llama-3 |
|
- llama-3.2 |
|
- 3b |
|
- chat |
|
- creative |
|
- conversational |
|
- not-for-all-audiences |
|
language: |
|
- en |
|
- ru |
|
|
|
--- |
|
# Llama-3.2-Kapusta-3B-v8 |
|
|
|
>Small and useful. |
|
|
|
![KapustaLogo256.png](https://cdn-uploads.huggingface.co/production/uploads/673125091920e70ac26c8a2e/vRJ1aZxx5o-_7yC2L2YCL.png) |
|
|
|
This is an interesting merge of **14 cool models**, created using [mergekit](https://github.com/arcee-ai/mergekit). |
|
Enjoy exploring :) |
|
|
|
## Merge Details |
|
### Method |
|
|
|
This model was merged using a multistep process, remerging intermediate results with some model variations for the best result. |
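
The multistep process can be read as a dependency graph: each intermediate merge has to exist before any step that uses it as a `base_model` or as an input model. The sketch below is one possible way to derive a valid build order from the step names listed under Configuration and to turn it into `mergekit-yaml` invocations. The one-file-per-step layout, the `configs` and `merged` directory names, and the helper names are assumptions made for illustration, not the author's actual setup; the model paths inside each YAML file would also need to point at wherever the earlier outputs were written.

```python
from graphlib import TopologicalSorter

K = "Llama-3.2-Kapusta-3B-"

# Each step maps to the earlier steps it reads, as listed under Configuration.
# Models pulled from the Hub are omitted because they need no prior merge.
STEP_INPUTS = {
    "A-3B-v1": [],
    "B-3B-v1": [],
    "C-3B-v1": [],
    K + "v1": ["A-3B-v1", "B-3B-v1", "C-3B-v1"],
    K + "v2": [K + "v1"],
    K + "v3": [K + "v1", K + "v2"],
    K + "v4A": [K + "v1", K + "v3"],
    K + "v4B": [K + "v1", K + "v3"],
    K + "v4C": [K + "v1", K + "v3"],
    K + "v5": [K + "v3", K + "v4A", K + "v4B", K + "v4C"],
    K + "v6": [K + "v3", K + "v5"],
    K + "v7": [K + v for v in ("v1", "v2", "v3", "v4A", "v4B", "v4C", "v5", "v6")],
    K + "v8": [K + "v1", K + "v3", K + "v5", K + "v7"],
}


def merge_order():
    """Return step names so that every step comes after the steps it reads."""
    return list(TopologicalSorter(STEP_INPUTS).static_order())


def merge_commands(config_dir="configs", output_dir="merged"):
    """Build one mergekit-yaml command per step (hypothetical file layout)."""
    return [
        ["mergekit-yaml", f"{config_dir}/{name}.yml", f"{output_dir}/{name}"]
        for name in merge_order()
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is merged by accident.
    for command in merge_commands():
        print(" ".join(command))
```

Because v8 reads v7, and v7 reads every earlier Kapusta version, the final model always comes last in the resulting order.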
|
|
|
### Models |
|
|
|
The following models were included in the merge: |
|
|
|
* [bunnycore/Llama-3.2-3B-Creative](https://huggingface.co/bunnycore/Llama-3.2-3B-Creative) |
|
* [bunnycore/Llama-3.2-3B-Mix](https://huggingface.co/bunnycore/Llama-3.2-3B-Mix) |
|
* [bunnycore/Llama-3.2-3B-Pure-RP](https://huggingface.co/bunnycore/Llama-3.2-3B-Pure-RP) |
|
* [bunnycore/Llama-3.2-3B-Stock](https://huggingface.co/bunnycore/Llama-3.2-3B-Stock) |
|
* [bunnycore/Llama-3.2-3B-TitanFusion-v2](https://huggingface.co/bunnycore/Llama-3.2-3B-TitanFusion-v2) |
|
* [CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct](https://huggingface.co/CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct) |
|
* [Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated](https://huggingface.co/Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated) |
|
* [Hastagaras/L3.2-JametMini-3B-MK.III](https://huggingface.co/Hastagaras/L3.2-JametMini-3B-MK.III) |
|
* [huihui-ai/Llama-3.2-3B-Instruct-abliterated](https://huggingface.co/huihui-ai/Llama-3.2-3B-Instruct-abliterated) |
|
* [Lyte/Llama-3.2-3B-Overthinker](https://huggingface.co/Lyte/Llama-3.2-3B-Overthinker) |
|
* [passing2961/Thanos-3B](https://huggingface.co/passing2961/Thanos-3B) |
|
* [SaisExperiments/Evil-Alpaca-3B-L3.2](https://huggingface.co/SaisExperiments/Evil-Alpaca-3B-L3.2) |
|
* [ValiantLabs/Llama3.2-3B-Enigma](https://huggingface.co/ValiantLabs/Llama3.2-3B-Enigma) |
|
* [ValiantLabs/Llama3.2-3B-ShiningValiant2](https://huggingface.co/ValiantLabs/Llama3.2-3B-ShiningValiant2) |
|
|
|
### Configuration |
|
|
|
The following YAML configurations were used to produce this model: |
|
|
|
```yaml |
|
# A-3B-v1 |
|
models: |
|
- model: Hastagaras/L3.2-JametMini-3B-MK.III |
|
- model: huihui-ai/Llama-3.2-3B-Instruct-abliterated |
|
merge_method: model_stock |
|
base_model: Lyte/Llama-3.2-3B-Overthinker |
|
dtype: bfloat16 |
|
|
|
# B-3B-v1 |
|
models: |
|
- model: ValiantLabs/Llama3.2-3B-ShiningValiant2 |
|
- model: bunnycore/Llama-3.2-3B-Stock |
|
merge_method: model_stock |
|
base_model: Lyte/Llama-3.2-3B-Overthinker |
|
dtype: bfloat16 |
|
|
|
# C-3B-v1 |
|
models: |
|
- model: bunnycore/Llama-3.2-3B-Pure-RP |
|
- model: CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct |
|
merge_method: model_stock |
|
base_model: ValiantLabs/Llama3.2-3B-ShiningValiant2 |
|
dtype: bfloat16 |
|
|
|
# Llama-3.2-Kapusta-3B-v1 |
|
models: |
|
- model: A-3B-v1 |
|
  parameters: |
|
    density: [0.8, 0.5, 0.2] |
|
    weight: 0.8 |
|
- model: B-3B-v1 |
|
  parameters: |
|
    density: [0.2, 0.8, 0.2] |
|
    weight: 0.25 |
|
- model: C-3B-v1 |
|
  parameters: |
|
    density: [0.2, 0.5, 0.8] |
|
    weight: 0.6 |
|
merge_method: ties |
|
base_model: bunnycore/Llama-3.2-3B-Mix |
|
dtype: bfloat16 |
|
|
|
# Llama-3.2-Kapusta-3B-v2 |
|
models: |
|
- model: SaisExperiments/Evil-Alpaca-3B-L3.2 |
|
- model: bunnycore/Llama-3.2-3B-TitanFusion-v2 |
|
merge_method: model_stock |
|
base_model: F:/3b/Llama-3.2-Kapusta-3B-v1 |
|
dtype: bfloat16 |
|
|
|
# Llama-3.2-Kapusta-3B-v3 |
|
models: |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v2 |
|
  parameters: |
|
    weight: [0.5, 0.6, 0.4, 0.7, 0.3, 0.8, 0.2, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.5] |
|
    density: [0.2, 0.8, 0.2] |
|
merge_method: della |
|
parameters: |
|
  epsilon: 0.1 |
|
  lambda: 0.5 |
|
base_model: F:/3b/Llama-3.2-Kapusta-3B-v1 |
|
dtype: bfloat16 |
|
|
|
# Llama-3.2-Kapusta-3B-v4A | della |
|
models: |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v1 |
|
  parameters: |
|
    weight: 0.6 |
|
    density: 0.5 |
|
- model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated |
|
  parameters: |
|
    weight: 0.5 |
|
    density: [0.4, 0.3, 0.3, 0.3] |
|
- model: bunnycore/Llama-3.2-3B-Creative |
|
  parameters: |
|
    weight: 0.3 |
|
    density: [0.3, 0.4, 0.3, 0.3] |
|
- model: ValiantLabs/Llama3.2-3B-Enigma |
|
  parameters: |
|
    weight: 0.3 |
|
    density: [0.3, 0.3, 0.4, 0.3] |
|
- model: passing2961/Thanos-3B |
|
  parameters: |
|
    weight: 0.3 |
|
    density: [0.3, 0.3, 0.3, 0.4] |
|
merge_method: della |
|
parameters: |
|
  epsilon: 0.2 |
|
  lambda: 0.5 |
|
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3 |
|
dtype: bfloat16 |
|
|
|
# Llama-3.2-Kapusta-3B-v4B | breadcrumbs |
|
models: |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v1 |
|
  parameters: |
|
    weight: 0.6 |
|
    density: 0.5 |
|
- model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated |
|
  parameters: |
|
    weight: 0.5 |
|
    density: [0.4, 0.3, 0.3, 0.3] |
|
- model: bunnycore/Llama-3.2-3B-Creative |
|
  parameters: |
|
    weight: 0.3 |
|
    density: [0.3, 0.4, 0.3, 0.3] |
|
- model: ValiantLabs/Llama3.2-3B-Enigma |
|
  parameters: |
|
    weight: 0.3 |
|
    density: [0.3, 0.3, 0.4, 0.3] |
|
- model: passing2961/Thanos-3B |
|
  parameters: |
|
    weight: 0.3 |
|
    density: [0.3, 0.3, 0.3, 0.4] |
|
merge_method: breadcrumbs |
|
parameters: |
|
  gamma: 0.02 |
|
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3 |
|
dtype: bfloat16 |
|
|
|
# Llama-3.2-Kapusta-3B-v4C | dare_ties |
|
models: |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v1 |
|
  parameters: |
|
    weight: 0.6 |
|
    density: 0.5 |
|
- model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated |
|
  parameters: |
|
    weight: 0.5 |
|
    density: [0.4, 0.3, 0.3, 0.3] |
|
- model: bunnycore/Llama-3.2-3B-Creative |
|
  parameters: |
|
    weight: 0.3 |
|
    density: [0.3, 0.4, 0.3, 0.3] |
|
- model: ValiantLabs/Llama3.2-3B-Enigma |
|
  parameters: |
|
    weight: 0.3 |
|
    density: [0.3, 0.3, 0.4, 0.3] |
|
- model: passing2961/Thanos-3B |
|
  parameters: |
|
    weight: 0.3 |
|
    density: [0.3, 0.3, 0.3, 0.4] |
|
merge_method: dare_ties |
|
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3 |
|
dtype: bfloat16 |
|
|
|
# Llama-3.2-Kapusta-3B-v5 |
|
models: |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v4A |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v4B |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v4C |
|
merge_method: model_stock |
|
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3 |
|
dtype: bfloat16 |
|
|
|
# Llama-3.2-Kapusta-3B-v6 |
|
models: |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v3 |
|
merge_method: slerp |
|
base_model: F:/3b/Llama-3.2-Kapusta-3B-v5 |
|
dtype: bfloat16 |
|
parameters: |
|
  t: [0.5, 0.6, 0.4, 0.7, 0.3, 0.8, 0.2, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.5] |
|
|
|
# Llama-3.2-Kapusta-3B-v7 |
|
models: |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v1 |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v2 |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v3 |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v4A |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v4B |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v4C |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v5 |
|
merge_method: model_stock |
|
base_model: F:/3b/Llama-3.2-Kapusta-3B-v6 |
|
dtype: bfloat16 |
|
|
|
# Llama-3.2-Kapusta-3B-v8 |
|
models: |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v1 |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v3 |
|
- model: F:/3b/Llama-3.2-Kapusta-3B-v5 |
|
merge_method: model_stock |
|
base_model: F:/3b/Llama-3.2-Kapusta-3B-v7 |
|
dtype: bfloat16 |
|
``` |
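
Several of the configurations above give `density`, `weight`, or `t` as a list rather than a single number. mergekit treats such a list as a gradient whose values are interpolated across the layers of the model. The sketch below illustrates the idea with plain linear interpolation between evenly spaced anchor points. The helper name `expand_gradient` is hypothetical, the layer count of 28 is an assumption for Llama-3.2-3B, and mergekit's exact mapping of anchors to layers may differ in detail.

```python
def expand_gradient(anchors, num_layers):
    """Linearly interpolate a short list of anchor values across num_layers layers.

    Illustrative only: assumes anchors are evenly spaced from the first layer
    to the last one, which may not match mergekit's implementation exactly.
    """
    if not anchors:
        raise ValueError("anchors must contain at least one value")
    if num_layers < 1:
        raise ValueError("num_layers must be positive")
    if len(anchors) == 1 or num_layers == 1:
        return [float(anchors[0])] * num_layers

    values = []
    for layer in range(num_layers):
        # Position of this layer on the anchor axis, from 0 to len(anchors) - 1.
        pos = layer / (num_layers - 1) * (len(anchors) - 1)
        left = min(int(pos), len(anchors) - 2)
        frac = pos - left
        values.append(anchors[left] * (1 - frac) + anchors[left + 1] * frac)
    return values


if __name__ == "__main__":
    # density gradient of A-3B-v1 in the Kapusta v1 step, assumed 28 layers
    for layer, density in enumerate(expand_gradient([0.8, 0.5, 0.2], 28)):
        print(f"layer {layer:2d}: density {density:.3f}")
```

Read this way, `density: [0.8, 0.5, 0.2]` keeps more of a model's changes in the early layers and fewer in the late ones, while `[0.2, 0.5, 0.8]` does the opposite.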
|
|
|
>My thanks to the authors of the original models. Your work is incredible. Have a good time 🖤 |
|
|