---
base_model:
- Lambent/danube2-upscale-1.53lisa
- Lambent/danube2-upscale-1.51galore
- Lambent/danube2-upscale-1.531qlora
- Lambent/danube2-upscale-1.51qlora
library_name: transformers
tags:
- mergekit
- merge
datasets:
- HuggingFaceTB/cosmopedia-100k
- Vezora/Tested-22k-Python-Alpaca
- sordonia/redpajama-sample_from_valid_all
- nampdn-ai/tiny-bridgedict
- teknium/GPTeacher-General-Instruct
- Severian/Internal-Knowledge-Map
- Severian/Internal-Knowledge-Map-StoryWriter-RolePlaying
license: apache-2.0
---
# eq90parsedanube

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

It is the first of these upscale experiments to show a promising capability improvement over the base model `h2o-danube2-1.8b-base`.

The training methodology ... is a bit of a mess; I was trying out different things. I'm listing the datasets used at any point, but I don't think replicating the recipe is doable or sensible.

The original upscale is at Lambent/danube2-upscale-1, which duplicated layers 16-21. Various training methods were then attempted to repair the duplicated layers. This linear merge combines the four variants whose outputs were at least 90% parseable by the EQ-Bench benchmark.

| Model                                                                     |AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
|---------------------------------------------------------------------------|------:|------:|---------:|-------:|------:|
|[danube2-upscale-1.7](https://huggingface.co./Lambent/danube2-upscale-1.7)  |  27.97|  62.16|      42.2|    32.2|  41.13|

| Model                                                                     |EQ-Bench|Average|
|---------------------------------------------------------------------------|-------:|------:|
|[danube2-upscale-1.7](https://huggingface.co./Lambent/danube2-upscale-1.7)  |   15.52|  15.52|

### EQ-Bench

|  Task  |Version|            Metric           |  Value | |Stderr|
|--------|------:|-----------------------------|-------:|---|-----:|
|eq_bench|    2.1|eqbench,none                 |   15.52|   |      |
|        |       |eqbench_stderr,none          |    2.77|   |      |
|        |       |percent_parseable,none       |     100|   |      |
|        |       |percent_parseable_stderr,none|       0|   |      |
|        |       |alias                        |eq_bench|   |      |

Average: 15.52%

## Merge Details

### Merge Method

This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method.

### Models Merged

The following models were included in the merge:
* [Lambent/danube2-upscale-1.53lisa](https://huggingface.co./Lambent/danube2-upscale-1.53lisa)
* [Lambent/danube2-upscale-1.51galore](https://huggingface.co./Lambent/danube2-upscale-1.51galore)
* [Lambent/danube2-upscale-1.531qlora](https://huggingface.co./Lambent/danube2-upscale-1.531qlora)
* [Lambent/danube2-upscale-1.51qlora](https://huggingface.co./Lambent/danube2-upscale-1.51qlora)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Lambent/danube2-upscale-1.531qlora
    parameters:
      weight: 1.0
  - model: Lambent/danube2-upscale-1.53lisa
    parameters:
      weight: 1.0
  - model: Lambent/danube2-upscale-1.51galore
    parameters:
      weight: 1.0
  - model: Lambent/danube2-upscale-1.51qlora
    parameters:
      weight: 1.0
merge_method: linear
dtype: float16
```
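To reproduce the merge, the configuration above can be saved to a file and passed to mergekit's `mergekit-yaml` command-line tool. The snippet below is a minimal sketch of loading and sampling from the merged model with `transformers`; the repo ID used here is only illustrative, so substitute the actual Hugging Face repository for this model.

```python
# Minimal sketch: load the merged model with transformers and generate a sample.
# Assumption: the repo ID below is illustrative; replace it with the actual
# Hugging Face repository this card belongs to.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Lambent/danube2-upscale-1.7"  # assumed/illustrative repo ID

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16)

prompt = "Layer-duplication upscaling works by"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```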