
God this sucked to make. I'm getting burnt out on this fucker, so I'm releasing what I have as a partial update. I'll be focusing on L3.1 for the time being and taking a break from this project for a bit. What's done is... fine; I'm not super happy with it, but it's objectively better than v0.1.

Quants

GGUFs by mradermacher

GGUFs by BackyardAI

Details & Recommended Settings

(Still testing; subject to change)

Very experimental, expect bugs. Thrives in story-heavy and narrative RP, yet still excels at the basics as usual. Way more horny this time, no idea why. Slightly worse instruct following compared to v0.1. Really needs example dialogue to function; otherwise it'll easily spit out garbled text. I also tried to curb L3's tendency to double-line (leaving a blank line between paragraphs), with mild success. Calmer writing style.

Has a certain tendency to speak for {user}, but that's easily negated with a few instructs; see the example below.
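
One hypothetical phrasing for such an instruct, using the {user} macro above; this is my wording, not a tested line from the card, so adapt it to your frontend's syntax:

```
[Do not speak, act, or make decisions for {user}. Write only your own character's dialogue and actions, then end the reply.]
```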

Rec. Settings:

Template: Plain Text or L3
Temperature: 1.3
Min P: 0.1
Repeat Penalty: 1.05
Repeat Penalty Tokens: 256
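
If you're running the GGUF outside a frontend, these settings map straight onto llama-cpp-python. A minimal sketch below, assuming one of the quants linked above; the filename and prompt are placeholders:

```python
# Minimal sketch: the recommended samplers via llama-cpp-python.
# Model filename and prompt are hypothetical placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="L3-Horizon-Anteros-v0.1.5-13B.Q6_K.gguf",  # any quant from above
    n_ctx=8192,
    last_n_tokens_size=256,  # Repeat Penalty Tokens: 256
)

out = llm(
    "<prompt with example dialogue here>",  # the model really wants examples
    temperature=1.3,
    min_p=0.1,
    repeat_penalty=1.05,
    max_tokens=512,
)
print(out["choices"][0]["text"])
```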

Models Merged & Merge Theory

The following models were included in the merge:

ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
nothingiisreal/L3-8B-Celeste-V1
nothingiisreal/L3-8B-Instruct-Abliterated-DWP
ResplendentAI/Nymph_8B
Sao10K/L3-8B-Niitama-v1
TheDrummer/Llama-3SOME-8B-v2
OEvortex/Emotional-llama-8B

Merge theory writeup: can't be arsed rn, the config below will have to speak for itself.

Config

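# Stage 1: anterosv2.b, a 56-layer interleaved passthrough stack of Formax,
# Celeste, and Instruct-Abliterated-DWP. The q_proj/k_proj scale ramps taper
# the attention projections across two of the overlapping seams; the trailing
# #N comments track the running layer count.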
slices:
- sources:
  - layer_range: [0, 6] #6
    model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- sources:
  - layer_range: [4, 8] #10
    model: nothingiisreal/L3-8B-Celeste-V1
    parameters:
      scale:
      - filter: q_proj
        value: [1, 0.89]
- sources:
  - layer_range: [8, 12] #14
    model: nothingiisreal/L3-8B-Instruct-Abliterated-DWP
    parameters:
      scale:
      - filter: k_proj
        value: [0.89, 1]
- sources:
  - layer_range: [6, 12] #20
    model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- sources:
  - layer_range: [10, 14] #24
    model: nothingiisreal/L3-8B-Celeste-V1
- sources:
  - layer_range: [12, 16] #28
    model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- sources:
  - layer_range: [14, 28] #42
    model: nothingiisreal/L3-8B-Instruct-Abliterated-DWP
- sources:
  - layer_range: [21, 31] #52
    model: nothingiisreal/L3-8B-Celeste-V1
- sources:
  - layer_range: [28, 32] #56
    model: nothingiisreal/L3-8B-Instruct-Abliterated-DWP
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: anterosv2.b
---
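# Stage 2: anterosv2.c, a second 56-layer stack, this time from Nymph,
# Niitama, 3SOME, and Emotional-llama.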
slices:
- sources:
  - layer_range: [0, 4] #4
    model: ResplendentAI/Nymph_8B
- sources:
  - layer_range: [2, 10] #12
    model: Sao10K/L3-8B-Niitama-v1
- sources:
  - layer_range: [8, 12] #16
    model: TheDrummer/Llama-3SOME-8B-v2
- sources:
  - layer_range: [10, 16] #22
    model: Sao10K/L3-8B-Niitama-v1
- sources:
  - layer_range: [12, 20] #30
    model: ResplendentAI/Nymph_8B
- sources:
  - layer_range: [14, 18] #34
    model: OEvortex/Emotional-llama-8B
- sources:
  - layer_range: [16, 20] #38
    model: ResplendentAI/Nymph_8B
- sources:
  - layer_range: [18, 22] #42
    model: OEvortex/Emotional-llama-8B
- sources:
  - layer_range: [20, 24] #46
    model: TheDrummer/Llama-3SOME-8B-v2
- sources:
  - layer_range: [23, 31] #54
    model: ResplendentAI/Nymph_8B
- sources:
  - layer_range: [30, 32] #56
    model: Sao10K/L3-8B-Niitama-v1
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: anterosv2.c
---
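# Cut layers 2-17 out of each stack so the two can be blended segment by
# segment.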
slices:
- sources:
  - layer_range: [2, 17]
    model: anterosv2.b
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 2-17av2b.sl
---
slices:
- sources:
  - layer_range: [2, 17]
    model: anterosv2.c
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 2-17av2c.sl
---
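# Blend the two 2-17 slices with dare_linear: mostly the .b stack, easing
# toward an even split mid-segment.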
models: 
  - model: 2-17av2c.sl
    parameters:
      weight: [0.1, 0.5, 0.3]
  - model: 2-17av2b.sl
    parameters:
      weight: [0.9, 0.5, 0.7]
merge_method: dare_linear
base_model: 2-17av2b.sl
parameters:
  normalize: false
  int8_mask: true
dtype: float32
out_dtype: bfloat16
name: 2-17aav2.sl
---
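# Same again for layers 17-42 of each stack.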
slices:
- sources:
  - layer_range: [17, 42]
    model: anterosv2.b
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 17-42av2b.sl
---
slices:
- sources:
  - layer_range: [17, 42]
    model: anterosv2.c
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 17-42av2c.sl
---
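# Blend the 17-42 slices with della, crossfading from .b-dominant at the
# start to .c-dominant by the end.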
models: 
  - model: 17-42av2b.sl
    parameters:
      weight: [0.8, 0.5, 0.4, 0.3, 0.15]
      density: 0.65
      epsilon: 0.07
      lambda: 0.12
  - model: 17-42av2c.sl
    parameters:
      weight: [0.2, 0.5, 0.6, 0.7, 0.85]
      density: 0.7
      epsilon: 0.05
      lambda: 0.1
merge_method: della
base_model: 17-42av2c.sl
parameters:
  normalize: false
  int8_mask: true
dtype: float32
out_dtype: bfloat16
name: 17-42aav2.sl
---
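# And once more for layers 42-52.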
slices:
- sources:
  - layer_range: [42, 52]
    model: anterosv2.b
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 42-52av2b.sl
---
slices:
- sources:
  - layer_range: [42, 52]
    model: anterosv2.c
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 42-52av2c.sl
---
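# Blend the 42-52 slices with dare_linear, this time favoring the .c stack.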
models: 
  - model: 42-52av2c.sl
    parameters:
      weight: [0.9, 0.65, 0.9]
  - model: 42-52av2b.sl
    parameters:
      weight: [0.1, 0.35, 0.1]
merge_method: dare_linear
base_model: 42-52av2c.sl
parameters:
  normalize: false
  int8_mask: true
dtype: float32
out_dtype: bfloat16
name: 42-52aav2.sl
--- 
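# Final assembly: anterosv2.b's first two layers, then the three blended
# segments in order (each referenced over its full local layer range), capped
# with anterosv2.c's last four layers; 56 layers total.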
slices:
- sources:
  - layer_range: [0, 2]
    model: anterosv2.b
- sources:
  - layer_range: [0, 15]
    model: 2-17aav2.sl
- sources:
  - layer_range: [0, 25]
    model: 17-42aav2.sl
- sources:
  - layer_range: [0, 10]
    model: 42-52aav2.sl
- sources:
  - layer_range: [52, 56]
    model: anterosv2.c
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: anterosv0.1.5
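
For reproduction: this is a multi-stage config, where each ---separated document writes a named intermediate that later documents reference by that name. Mergekit's mega-merge script handles that format if your install ships it; otherwise, run the stages by hand in order. A single stage can be driven from Python per mergekit's documented API; a rough sketch with placeholder paths, not the exact build script:

```python
# Rough sketch: run one stage of the config above via mergekit's Python API.
# "stage1.yml" and the output path are placeholders; when running stages
# manually, later stages must reference earlier stages' output directories
# wherever the intermediate name appears.
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("stage1.yml") as f:  # one ---separated document from the config above
    config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    config,
    "./anterosv2.b",  # stage output; point later stages at this path
    options=MergeOptions(cuda=False, copy_tokenizer=True),
)
```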