---
base_model:
- akjindal53244/Llama-3.1-Storm-8B
- Casual-Autopsy/L3-Umbral-Mind-RP-v2.0-8B
library_name: transformers
tags:
- merge
- llama
- not-for-all-audiences
---

# Llama-3-Umbral-Storm-8B (8K)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/79tIjC6Ykm4rlwOHa9uzZ.png)

RP model with "L3-Umbral-Mind-v2.0" as the base, nearswapped with "Storm", one of the smartest L3.1 models.

* Warning: Based on Mopey-Mule, so it should be negative. Don't use this model for any truthful information or advice.

* <b>----></b>[ GGUF Q8 static](https://huggingface.co./v000000/L3-Umbral-Storm-8B-t0.0001-Q8_0-GGUF)

# Thank you mradermacher for the quants!

* [GGUFs](https://huggingface.co./mradermacher/L3-Umbral-Storm-8B-t0.0001-GGUF)
* [GGUFs imatrix](https://huggingface.co./mradermacher/L3-Umbral-Storm-8B-t0.0001-i1-GGUF)

-------------------------------------------------------------------------------

## merge

This is a merge of pre-trained language models.

## Merge Details

This model uses the Llama-3 architecture with Llama-3.1 merged in, so it has an 8k context length. It could possibly be extended slightly with RoPE due to the L3.1 layers.

### Merge Method

This model was merged using the <b>NEARSWAP t0.0001</b> merge algorithm.

### Models Merged

The following models were included in the merge:
* Base Model: [Casual-Autopsy/L3-Umbral-Mind-RP-v2.0-8B](https://huggingface.co./Casual-Autopsy/L3-Umbral-Mind-RP-v2.0-8B)
* [akjindal53244/Llama-3.1-Storm-8B](https://huggingface.co./akjindal53244/Llama-3.1-Storm-8B)

### Configuration

```yaml
slices:
  - sources:
      - model: Casual-Autopsy/L3-Umbral-Mind-RP-v2.0-8B
        layer_range: [0, 32]
      - model: akjindal53244/Llama-3.1-Storm-8B
        layer_range: [0, 32]
merge_method: nearswap
base_model: Casual-Autopsy/L3-Umbral-Mind-RP-v2.0-8B
parameters:
  t:
    - value: 0.0001
dtype: bfloat16
```

# Prompt Template:
```bash
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{output}<|eot_id|>

```
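As a minimal sketch, the template above can be filled in with plain string formatting. The helper name `build_prompt` and the example strings are hypothetical and not part of this model's tooling. The prompt stops after the assistant header so the model generates `{output}` itself.

```python
def build_prompt(system_prompt: str, user_input: str) -> str:
    """Fill the Llama-3 style template up to the start of the assistant turn.

    Hypothetical helper for illustration; it only mirrors the template text.
    """
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )


# Example strings are placeholders.
prompt = build_prompt("You are a roleplay partner.", "Hello!")
print(prompt)
```

If your inference stack already prepends `<|begin_of_text|>` during tokenization, you may need to drop it from the string to avoid a doubled token; check how your loader handles this.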

Credit to Alchemonaut:

```python
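import numpy as np

def lerp(a, b, t):
    # Linear interpolation: t=0 returns a, t=1 returns b.
    return a * (1 - t) + b * t

def nearswap(v0, v1, t):
    # Per-element distance between the two tensors.
    lweight = np.abs(v0 - v1)
    with np.errstate(divide='ignore', invalid='ignore'):
        # Weight is t / distance; identical elements get weight 1.0.
        lweight = np.where(lweight != 0, t / lweight, 1.0)
    # Replace any NaN/inf left over from the division.
    lweight = np.nan_to_num(lweight, nan=1.0, posinf=1.0, neginf=1.0)
    # Cap the weight so elements closer than t are fully swapped to v1.
    np.clip(lweight, a_min=0.0, a_max=1.0, out=lweight)
    return lerp(v0, v1, lweight)
```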
```
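To get a feel for what the function does at `t = 0.0001`, here is a small worked example on toy arrays (the values are made up for illustration, and the definitions are repeated so the snippet runs on its own). Reading the code, each element of `v0` moves toward `v1` by at most `t`: elements that differ by `t` or less are swapped to the `v1` value, and the rest shift by exactly `t`. The comparison against `v0 + clip(v1 - v0, -t, t)` is my own restatement of that behavior, not something from the original author.

```python
import numpy as np


def lerp(a, b, t):
    return a * (1 - t) + b * t


def nearswap(v0, v1, t):
    lweight = np.abs(v0 - v1)
    with np.errstate(divide='ignore', invalid='ignore'):
        lweight = np.where(lweight != 0, t / lweight, 1.0)
    lweight = np.nan_to_num(lweight, nan=1.0, posinf=1.0, neginf=1.0)
    np.clip(lweight, a_min=0.0, a_max=1.0, out=lweight)
    return lerp(v0, v1, lweight)


t = 0.0001
# Toy "weights": base model values and secondary model values.
v0 = np.array([1.0, 1.0, 1.0, 1.0])
v1 = np.array([1.0, 1.00005, 1.001, 2.0])

result = nearswap(v0, v1, t)

# Equivalent view: step from v0 toward v1, limited to t per element.
clipped_step = v0 + np.clip(v1 - v0, -t, t)

print(result)
print(np.allclose(result, clipped_step))
```

With such a small `t`, the merged weights stay very close to the base model, and only elements where the two models nearly agree take the other model's value outright.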

Credit to Numbra for the idea.