This is a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).
EQ-Bench:

| Tasks  |Version|Filter|n-shot|      Metric     |   |  Value |   |Stderr|
|--------|------:|------|-----:|-----------------|---|-------:|---|-----:|
|eq_bench|    2.1|none  |     0|eqbench          |↑  | 78.7955|±  |1.4668|
|        |       |none  |     0|percent_parseable|↑  |100.0000|±  |0.0000|

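These scores are in the output format of EleutherAI's lm-evaluation-harness, which ships an `eq_bench` task (version 2.1, matching the table). A sketch of the kind of command that produces such a table is below; the model id is a placeholder, since this card doesn't state the merged repo's name:

```sh
# Sketch, assuming lm-evaluation-harness is installed (pip install lm-eval).
# "Lambent/merged-model-7B" is a placeholder for the actual merged repo id.
lm_eval --model hf \
  --model_args pretrained=Lambent/merged-model-7B,dtype=bfloat16 \
  --tasks eq_bench \
  --num_fewshot 0
```
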
Version 0.3 involved three separate tunes, stock-merged on overlapping datasets for long-context writing, multi-turn conversation, and RP, with a touch of poetry and code.

From there, each of the four threads was separately task-tuned on two datasets.

Various methods of combining those via merging were tested; this one scored highest on EQ-Bench, which served as an indicator.

My understanding of the Model Stock merge method is that it mitigates task adaptation to a significant degree, but also significantly limits the forgetting caused by training.

I have hope that the adaptation, especially over two stages, is still sufficient to help with the longer contexts and multi-turn conversations carried over from the ancestor models, and to add some individual style while retaining a fair amount of their capability.
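
For reference, my paraphrase of the Model Stock paper (not something verified against this particular merge): it picks a per-layer interpolation ratio from the angle between the fine-tuned models' task vectors and pulls their average back toward the pretrained base,

$$
t = \frac{k\cos\theta}{1 + (k-1)\cos\theta}, \qquad w_{\mathrm{merged}} = t\,w_{\mathrm{avg}} + (1-t)\,w_{0},
$$

where $w_0$ is the base weight, $w_{\mathrm{avg}}$ is the average of the $k$ fine-tuned weights, and $\theta$ is the (assumed roughly shared) angle between their task vectors. Small angles keep more of the adaptation; large angles lean on the base, which matches the mitigated-adaptation, limited-forgetting behavior described above.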

This model's refusals are... not nonexistent, but certainly don't rely on them.

To my knowledge it has no particular refusal behavior for merely NSFW content, but I haven't exactly exhaustively tested which OSHA violations it will aid and abet.

### Merge Method
This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [Lambent/threebird-scribe-alpha0.3-7B](https://huggingface.co/Lambent/threebird-scribe-alpha0.3-7B) as a base.
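
### Configuration

This card doesn't include the mergekit YAML, so the following is a minimal sketch of what a Model Stock merge of this shape looks like. The four task-tuned "thread" ids are hypothetical placeholders; only `base_model` is taken from this card:

```yaml
# Hypothetical mergekit config for a Model Stock merge of this shape.
# The four "thread" model ids are placeholders; only base_model comes from the card.
models:
  - model: Lambent/thread-a-tasktuned-7B  # placeholder
  - model: Lambent/thread-b-tasktuned-7B  # placeholder
  - model: Lambent/thread-c-tasktuned-7B  # placeholder
  - model: Lambent/thread-d-tasktuned-7B  # placeholder
merge_method: model_stock
base_model: Lambent/threebird-scribe-alpha0.3-7B
dtype: bfloat16
```

A config like this would be run with `mergekit-yaml config.yml ./merged-model`.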