---
base_model:
- v000000/MN-12B-Part1
- v000000/MN-12B-Part2
library_name: transformers
tags:
- mergekit
- merge
- mistral
---

Mistral-Nemo-12B-Estrella-v1
---------------------------------------------------------------------

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/nGi9VcdMMmRVbykIUJn2P.png)

Untested! Prompt format: Mistral Instruct or ChatML.

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was built with a multi-step merge using the DELLA, DELLA_LINEAR, and SLERP merge methods. Steps 1 and 2 produce the intermediate models v000000/MN-12B-Part1 and v000000/MN-12B-Part2, which step 3 then combines with SLERP.

### Models Merged

The following models were included in the merge:

* [nothingiisreal/MN-12B-Celeste-V1.9](https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9)
* [shuttleai/shuttle-2.5-mini](https://huggingface.co/shuttleai/shuttle-2.5-mini)
* [anthracite-org/magnum-12b-v2](https://huggingface.co/anthracite-org/magnum-12b-v2)
* [Sao10K/MN-12B-Lyra-v1](https://huggingface.co/Sao10K/MN-12B-Lyra-v1)
* [unsloth/Mistral-Nemo-Instruct-2407](https://huggingface.co/unsloth/Mistral-Nemo-Instruct-2407)
* [NeverSleep/Lumimaid-v0.2-12B](https://huggingface.co/NeverSleep/Lumimaid-v0.2-12B)
* [UsernameJustAnother/Nemo-12B-Marlin-v5](https://huggingface.co/UsernameJustAnother/Nemo-12B-Marlin-v5)
* [BeaverAI/mistral-doryV2-12b](https://huggingface.co/BeaverAI/mistral-doryV2-12b)
* [invisietch/Atlantis-v0.1-12B](https://huggingface.co/invisietch/Atlantis-v0.1-12B)

### Configuration

The following YAML configurations were used to produce this model. Each step is a separate mergekit config, run in order: step 3 consumes the outputs of steps 1 and 2 (the Part1/Part2 intermediates).

```yaml
# Step 1: DELLA merge
models:
  - model: Sao10K/MN-12B-Lyra-v1
    parameters:
      weight: 0.15
      density: 0.77
  - model: shuttleai/shuttle-2.5-mini
    parameters:
      weight: 0.20
      density: 0.78
  - model: anthracite-org/magnum-12b-v2
    parameters:
      weight: 0.35
      density: 0.85
  - model: nothingiisreal/MN-12B-Celeste-V1.9
    parameters:
      weight: 0.55
      density: 0.90
merge_method: della
base_model: Sao10K/MN-12B-Lyra-v1
parameters:
  int8_mask: true
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
```

```yaml
# Step 2: DELLA linear merge
models:
  - model: BeaverAI/mistral-doryV2-12b
    parameters:
      weight: 0.10
      density: 0.4
  - model: unsloth/Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.20
      density: 0.4
  - model: UsernameJustAnother/Nemo-12B-Marlin-v5
    parameters:
      weight: 0.25
      density: 0.5
  - model: invisietch/Atlantis-v0.1-12B
    parameters:
      weight: 0.3
      density: 0.5
  - model: NeverSleep/Lumimaid-v0.2-12B
    parameters:
      weight: 0.4
      density: 0.8
merge_method: della_linear
base_model: anthracite-org/magnum-12b-v2
parameters:
  int8_mask: true
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
```

```yaml
# Step 3: SLERP of the two intermediate parts
slices:
  - sources:
      - model: v000000/MN-12B-Part2
        layer_range: [0, 40]
      - model: v000000/MN-12B-Part1
        layer_range: [0, 40]
merge_method: slerp
base_model: v000000/MN-12B-Part1
parameters:
  # smooth gradient, prioritizing Part1
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 0.6, 0.1, 0.6, 0.3, 0.8, 0.5]
    - filter: mlp
      value: [0, 0.5, 0.4, 0.3, 0, 0.3, 0.4, 0.7, 0.2, 0.5]
    - value: 0.5
dtype: bfloat16
```
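
### Reproducing the merge

The three configs above can be driven from mergekit's Python API instead of running `mergekit-yaml` three times by hand. This is a minimal sketch, assuming the blocks are saved as `step1.yml`, `step2.yml`, and `step3.yml`, that step 1 yields Part1 and step 2 yields Part2 (the card does not say which part is which), and that `step3.yml` is edited to point at the local outputs rather than the `v000000/...` Hub repos:

```python
# Sketch: run the three merge steps in order with mergekit's Python API.
# Assumptions: the YAML blocks above are saved as step1/2/3.yml, and
# step3.yml references the local Part1/Part2 output directories.
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

steps = [
    ("step1.yml", "./MN-12B-Part1"),        # DELLA merge
    ("step2.yml", "./MN-12B-Part2"),        # DELLA linear merge
    ("step3.yml", "./MN-12B-Estrella-v1"),  # final SLERP of the two parts
]

for config_path, out_path in steps:
    with open(config_path, "r", encoding="utf-8") as fp:
        merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))
    run_merge(
        merge_config,
        out_path,
        options=MergeOptions(
            cuda=torch.cuda.is_available(),  # merge on GPU when available
            copy_tokenizer=True,             # ship a tokenizer with each output
            lazy_unpickle=True,              # lower peak memory while loading
        ),
    )
```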
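
### Usage

For inference, here is a minimal transformers sketch that prompts the model through the chat template shipped with the tokenizer (Mistral Instruct, per the note at the top). The repo id `v000000/MN-12B-Estrella-v1` is an assumption inferred from the part names; substitute the actual repository or a local path:

```python
# Sketch: load the merged model and generate with its chat template.
# NOTE: the repo id is hypothetical; point it at the real weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "v000000/MN-12B-Estrella-v1"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # the merge was performed in bfloat16
    device_map="auto",
)

messages = [{"role": "user", "content": "Write a short scene set on a night train."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```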