---
base_model:
- Lambent/danube2-upscale-1.53lisa
- Lambent/danube2-upscale-1.51galore
- Lambent/danube2-upscale-1.531qlora
- Lambent/danube2-upscale-1.51qlora
library_name: transformers
tags:
- mergekit
- merge
datasets:
- HuggingFaceTB/cosmopedia-100k
- Vezora/Tested-22k-Python-Alpaca
- sordonia/redpajama-sample_from_valid_all
- nampdn-ai/tiny-bridgedict
- teknium/GPTeacher-General-Instruct
- Severian/Internal-Knowledge-Map
- Severian/Internal-Knowledge-Map-StoryWriter-RolePlaying
license: apache-2.0
---
# eq90parsedanube

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

It is the first of these upscale experiments to show a promising capability improvement over the base model `h2o-danube2-1.8b-base`.

The training methodology ... is a bit of a mess; I was trying out different things. I'm listing the datasets used at any point, but I don't think replicating the recipe is doable or sensible.

The original upscale is at Lambent/danube2-upscale-1, which duplicated layers 16-21. Various training methods were then attempted to repair the duplicated layers. This linear merge combines the four variants whose outputs were at least 90% parseable by the EQ-Bench benchmark.

| Model                                                                     |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
|---------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
|[danube2-upscale-1.7](https://huggingface.co./Lambent/danube2-upscale-1.7)  |  27.97|  62.16|      42.2|    32.2|  41.13|

| Model                                                                     |EQ-Bench|Average|
|---------------------------------------------------------------------------|-------:|------:|
|[danube2-upscale-1.7](https://huggingface.co./Lambent/danube2-upscale-1.7)  |   15.52|  15.52|

### EQ-Bench

|  Task  |Version|            Metric           |  Value | |Stderr|
|--------|------:|-----------------------------|-------:|---|-----:|
|eq_bench|    2.1|eqbench,none                 |   15.52|   |      |
|        |       |eqbench_stderr,none          |    2.77|   |      |
|        |       |percent_parseable,none       |     100|   |      |
|        |       |percent_parseable_stderr,none|       0|   |      |
|        |       |alias                        |eq_bench|   |      |

Average: 15.52%

## Merge Details

### Merge Method

This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method.

### Models Merged

The following models were included in the merge:
* [Lambent/danube2-upscale-1.53lisa](https://huggingface.co./Lambent/danube2-upscale-1.53lisa)
* [Lambent/danube2-upscale-1.51galore](https://huggingface.co./Lambent/danube2-upscale-1.51galore)
* [Lambent/danube2-upscale-1.531qlora](https://huggingface.co./Lambent/danube2-upscale-1.531qlora)
* [Lambent/danube2-upscale-1.51qlora](https://huggingface.co./Lambent/danube2-upscale-1.51qlora)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Lambent/danube2-upscale-1.531qlora
    parameters:
      weight: 1.0
  - model: Lambent/danube2-upscale-1.53lisa
    parameters:
      weight: 1.0
  - model: Lambent/danube2-upscale-1.51galore
    parameters:
      weight: 1.0
  - model: Lambent/danube2-upscale-1.51qlora
    parameters:
      weight: 1.0
merge_method: linear
dtype: float16
```
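To reproduce the merge, the configuration above can be saved to a file and passed to mergekit's `mergekit-yaml` command-line tool. The snippet below is a minimal sketch of loading and sampling from the merged model with `transformers`; the repo ID used here is only illustrative, so substitute the actual Hugging Face repository for this model.

```python
# Minimal sketch: load the merged model with transformers and generate a sample.
# Assumption: the repo ID below is illustrative; replace it with the actual
# Hugging Face repository this card belongs to.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Lambent/danube2-upscale-1.7"  # assumed/illustrative repo ID

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16)

prompt = "Layer-duplication upscaling works by"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```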