---
base_model:
- bunnycore/Llama-3.2-3B-Creative
- bunnycore/Llama-3.2-3B-Mix
- bunnycore/Llama-3.2-3B-Pure-RP
- bunnycore/Llama-3.2-3B-Stock
- bunnycore/Llama-3.2-3B-TitanFusion-v2
- CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct
- Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
- Hastagaras/L3.2-JametMini-3B-MK.III
- huihui-ai/Llama-3.2-3B-Instruct-abliterated
- Lyte/Llama-3.2-3B-Overthinker
- passing2961/Thanos-3B
- SaisExperiments/Evil-Alpaca-3B-L3.2
- ValiantLabs/Llama3.2-3B-Enigma
- ValiantLabs/Llama3.2-3B-ShiningValiant2
library_name: transformers
tags:
- mergekit
- merge
- bfloat16
- safetensors
- llama
- llama-3
- llama-3.2
- 3b
- chat
- creative
- conversational
- not-for-all-audiences
language:
- en
- ru
---
# Llama-3.2-Kapusta-3B-v8
>Small and useful.
![KapustaLogo256.png](https://cdn-uploads.huggingface.co/production/uploads/673125091920e70ac26c8a2e/vRJ1aZxx5o-_7yC2L2YCL.png)
This is an interesting merge of **14 cool models**, created using [mergekit](https://github.com/arcee-ai/mergekit).
Enjoy exploring :)
## Merge Details
### Method
This model was merged in a multistep process: intermediate merges were built and then remerged, trying several model variations along the way to get the best result.
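In practice, each stage of such a multistep merge can be run as its own mergekit config, with the result written to a local folder that later stages reference via `base_model` (as the `F:/3b/...` paths in the configurations below do). Here is a minimal sketch of one stage using mergekit's documented Python interface; this is an illustrative workflow, not the author's exact script, and option names may differ slightly between mergekit versions. The output path is only an example.

```python
# Sketch: run the first-stage "A-3B-v1" config and save it locally so the next
# stage's YAML can point at it via base_model. Paths and options are illustrative.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = """
models:
  - model: Hastagaras/L3.2-JametMini-3B-MK.III
  - model: huihui-ai/Llama-3.2-3B-Instruct-abliterated
merge_method: model_stock
base_model: Lyte/Llama-3.2-3B-Overthinker
dtype: bfloat16
"""

merge_config = MergeConfiguration.model_validate(yaml.safe_load(CONFIG_YML))

run_merge(
    merge_config,
    out_path="./A-3B-v1",                  # later configs can reference this folder
    options=MergeOptions(
        cuda=torch.cuda.is_available(),    # use the GPU if one is present
        copy_tokenizer=True,               # carry the tokenizer into the output
        lazy_unpickle=True,                # lower peak RAM while loading shards
    ),
)
```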
### Models
The following models were included in the merge:
* [bunnycore/Llama-3.2-3B-Creative](https://huggingface.co./bunnycore/Llama-3.2-3B-Creative)
* [bunnycore/Llama-3.2-3B-Mix](https://huggingface.co./bunnycore/Llama-3.2-3B-Mix)
* [bunnycore/Llama-3.2-3B-Pure-RP](https://huggingface.co./bunnycore/Llama-3.2-3B-Pure-RP)
* [bunnycore/Llama-3.2-3B-Stock](https://huggingface.co./bunnycore/Llama-3.2-3B-Stock)
* [bunnycore/Llama-3.2-3B-TitanFusion-v2](https://huggingface.co./bunnycore/Llama-3.2-3B-TitanFusion-v2)
* [CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct](https://huggingface.co./CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct)
* [Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated](https://huggingface.co./Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated)
* [Hastagaras/L3.2-JametMini-3B-MK.III](https://huggingface.co./Hastagaras/L3.2-JametMini-3B-MK.III)
* [huihui-ai/Llama-3.2-3B-Instruct-abliterated](https://huggingface.co./huihui-ai/Llama-3.2-3B-Instruct-abliterated)
* [Lyte/Llama-3.2-3B-Overthinker](https://huggingface.co./Lyte/Llama-3.2-3B-Overthinker)
* [passing2961/Thanos-3B](https://huggingface.co./passing2961/Thanos-3B)
* [SaisExperiments/Evil-Alpaca-3B-L3.2](https://huggingface.co./SaisExperiments/Evil-Alpaca-3B-L3.2)
* [ValiantLabs/Llama3.2-3B-Enigma](https://huggingface.co./ValiantLabs/Llama3.2-3B-Enigma)
* [ValiantLabs/Llama3.2-3B-ShiningValiant2](https://huggingface.co./ValiantLabs/Llama3.2-3B-ShiningValiant2)
### Configuration
The following YAML configurations were used to produce this model:
```yaml
# A-3B-v1
models:
  - model: Hastagaras/L3.2-JametMini-3B-MK.III
  - model: huihui-ai/Llama-3.2-3B-Instruct-abliterated
merge_method: model_stock
base_model: Lyte/Llama-3.2-3B-Overthinker
dtype: bfloat16

# B-3B-v1
models:
  - model: ValiantLabs/Llama3.2-3B-ShiningValiant2
  - model: bunnycore/Llama-3.2-3B-Stock
merge_method: model_stock
base_model: Lyte/Llama-3.2-3B-Overthinker
dtype: bfloat16

# C-3B-v1
models:
  - model: bunnycore/Llama-3.2-3B-Pure-RP
  - model: CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct
merge_method: model_stock
base_model: ValiantLabs/Llama3.2-3B-ShiningValiant2
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v1
models:
  - model: A-3B-v1
    parameters:
      density: [0.8, 0.5, 0.2]
      weight: 0.8
  - model: B-3B-v1
    parameters:
      density: [0.2, 0.8, 0.2]
      weight: 0.25
  - model: C-3B-v1
    parameters:
      density: [0.2, 0.5, 0.8]
      weight: 0.6
merge_method: ties
base_model: bunnycore/Llama-3.2-3B-Mix
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v2
models:
  - model: SaisExperiments/Evil-Alpaca-3B-L3.2
  - model: bunnycore/Llama-3.2-3B-TitanFusion-v2
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v1
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v3
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v2
    parameters:
      weight: [0.5, 0.6, 0.4, 0.7, 0.3, 0.8, 0.2, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.5]
      density: [0.2, 0.8, 0.2]
merge_method: della
parameters:
  epsilon: 0.1
  lambda: 0.5
base_model: F:/3b/Llama-3.2-Kapusta-3B-v1
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v4A | della
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v1
    parameters:
      weight: 0.6
      density: 0.5
  - model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
    parameters:
      weight: 0.5
      density: [0.4, 0.3, 0.3, 0.3]
  - model: bunnycore/Llama-3.2-3B-Creative
    parameters:
      weight: 0.3
      density: [0.3, 0.4, 0.3, 0.3]
  - model: ValiantLabs/Llama3.2-3B-Enigma
    parameters:
      weight: 0.3
      density: [0.3, 0.3, 0.4, 0.3]
  - model: passing2961/Thanos-3B
    parameters:
      weight: 0.3
      density: [0.3, 0.3, 0.3, 0.4]
merge_method: della
parameters:
  epsilon: 0.2
  lambda: 0.5
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v4B | breadcrumbs
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v1
    parameters:
      weight: 0.6
      density: 0.5
  - model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
    parameters:
      weight: 0.5
      density: [0.4, 0.3, 0.3, 0.3]
  - model: bunnycore/Llama-3.2-3B-Creative
    parameters:
      weight: 0.3
      density: [0.3, 0.4, 0.3, 0.3]
  - model: ValiantLabs/Llama3.2-3B-Enigma
    parameters:
      weight: 0.3
      density: [0.3, 0.3, 0.4, 0.3]
  - model: passing2961/Thanos-3B
    parameters:
      weight: 0.3
      density: [0.3, 0.3, 0.3, 0.4]
merge_method: breadcrumbs
parameters:
  gamma: 0.02
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v4C | dare_ties
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v1
    parameters:
      weight: 0.6
      density: 0.5
  - model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
    parameters:
      weight: 0.5
      density: [0.4, 0.3, 0.3, 0.3]
  - model: bunnycore/Llama-3.2-3B-Creative
    parameters:
      weight: 0.3
      density: [0.3, 0.4, 0.3, 0.3]
  - model: ValiantLabs/Llama3.2-3B-Enigma
    parameters:
      weight: 0.3
      density: [0.3, 0.3, 0.4, 0.3]
  - model: passing2961/Thanos-3B
    parameters:
      weight: 0.3
      density: [0.3, 0.3, 0.3, 0.4]
merge_method: dare_ties
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v5
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4A
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4B
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4C
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v6
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v3
merge_method: slerp
base_model: F:/3b/Llama-3.2-Kapusta-3B-v5
dtype: bfloat16
parameters:
  t: [0.5, 0.6, 0.4, 0.7, 0.3, 0.8, 0.2, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.5]

# Llama-3.2-Kapusta-3B-v7
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v1
  - model: F:/3b/Llama-3.2-Kapusta-3B-v2
  - model: F:/3b/Llama-3.2-Kapusta-3B-v3
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4A
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4B
  - model: F:/3b/Llama-3.2-Kapusta-3B-v4C
  - model: F:/3b/Llama-3.2-Kapusta-3B-v5
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v6
dtype: bfloat16

# Llama-3.2-Kapusta-3B-v8
models:
  - model: F:/3b/Llama-3.2-Kapusta-3B-v1
  - model: F:/3b/Llama-3.2-Kapusta-3B-v3
  - model: F:/3b/Llama-3.2-Kapusta-3B-v5
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v7
dtype: bfloat16
```
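For a quick chat-style test of the final merge, here is a minimal sketch using 🤗 Transformers. The repository id `Khetterman/Llama-3.2-Kapusta-3B-v8` and the sampling settings are assumptions for illustration, not fixed recommendations.

```python
# Minimal sketch, assuming the model is published as Khetterman/Llama-3.2-Kapusta-3B-v8
# (adjust the repo id if needed); sampling settings are illustrative, not tuned.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Khetterman/Llama-3.2-Kapusta-3B-v8"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,   # the merge was produced in bfloat16
    device_map="auto",
)

messages = [{"role": "user", "content": "Tell me a short story about cabbage."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```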
>My thanks to the authors of the original models; your work is incredible. Have a good time 🖤