---
base_model:
- UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
- princeton-nlp/gemma-2-9b-it-SimPO
library_name: transformers
tags:
- mergekit
- merge
license: gemma
pipeline_tag: text-generation
---
# Gemma2-Nephilim-v3-9B

This repo contains a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

Although none of the components of this merge were trained or intended for roleplay, the model can be used effectively in that role.

Tested with temperature 1 and minP 0.01. This model leans toward being creative, so adjust temperature upward or downward as desired.

The Instruct template used during testing can be found below:
- [context template](https://huggingface.co./debased-ai/SillyTavern-settings/blob/main/advanced_formatting/context_template/Gemma2%20Unleashed3.json)
- [instruct prompt](https://huggingface.co./debased-ai/SillyTavern-settings/blob/main/advanced_formatting/instruct_mode/Gemma2%20Unleashed3.json)

Afterword: In subsequent testing, I encountered an occasional lapse in tracking context for complex scenarios, which seems to originate in the SPPO model. This lapse is not present in [grimjim/Kitsunebi-v1-Gemma2-8k-9B](https://huggingface.co./grimjim/Kitsunebi-v1-Gemma2-8k-9B).

## Merge Details
### Merge Method

This model was merged using the SLERP merge method.
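
For readers unfamiliar with SLERP (spherical linear interpolation), here is a minimal sketch of the idea on plain Python lists. It is not mergekit's implementation, which operates on full model tensors; the function name `slerp`, the `eps` threshold, and the fallback to ordinary linear interpolation for nearly parallel vectors are assumptions made for illustration.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Illustrative SLERP between two flat weight vectors (lists of floats).

    t = 0 returns v0 and t = 1 returns v1; values in between follow the arc
    between the two directions instead of the straight line between them.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the two vectors, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / max(norm0 * norm1, eps)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    if sin_theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation
        # (an assumption of this sketch, used to avoid dividing by ~0).
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Halfway between two orthogonal unit vectors: both components are about 0.7071.
mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Unlike a straight average, which would give `[0.5, 0.5]` here, the SLERP midpoint keeps unit length.
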
### Models Merged

The following models were included in the merge:
* [UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3](https://huggingface.co./UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3)
* [princeton-nlp/gemma-2-9b-it-SimPO](https://huggingface.co./princeton-nlp/gemma-2-9b-it-SimPO)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
slices:
  - sources:
      - model: princeton-nlp/gemma-2-9b-it-SimPO
        layer_range:
          - 0
          - 42
      - model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
        layer_range:
          - 0
          - 42
merge_method: slerp
base_model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
parameters:
  t:
    - filter: self_attn
      value:
        - 0
        - 0.5
        - 0.3
        - 0.7
        - 1
    - filter: mlp
      value:
        - 1
        - 0.5
        - 0.7
        - 0.3
        - 0
    - value: 0.5
dtype: bfloat16
```
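
The five-value lists under `t` can be read as a per-layer schedule. The sketch below assumes that the listed values are spaced evenly over the 42 layers and linearly interpolated in between, and that `t = 0` selects the `base_model`; both points should be checked against the mergekit documentation. The helper name `expand_gradient` is hypothetical and is not part of mergekit.

```python
def expand_gradient(anchors, num_layers):
    """Spread anchor values evenly over num_layers and interpolate linearly.

    Hypothetical helper for illustration; mergekit's own handling of
    gradient lists may differ in detail.
    """
    if len(anchors) == 1 or num_layers == 1:
        return [float(anchors[0])] * num_layers
    out = []
    for layer in range(num_layers):
        # Position of this layer on the anchor axis, in [0, len(anchors) - 1].
        pos = layer / (num_layers - 1) * (len(anchors) - 1)
        lo = min(int(pos), len(anchors) - 2)
        frac = pos - lo
        out.append((1 - frac) * anchors[lo] + frac * anchors[lo + 1])
    return out

# Values copied from the configuration above; 42 matches the layer_range.
self_attn_t = expand_gradient([0, 0.5, 0.3, 0.7, 1], 42)
mlp_t = expand_gradient([1, 0.5, 0.7, 0.3, 0], 42)

for layer in (0, 10, 20, 30, 41):
    print(f"layer {layer:2d}: self_attn t={self_attn_t[layer]:.3f}, mlp t={mlp_t[layer]:.3f}")
```

Under these assumptions the two schedules mirror each other: at every layer the `self_attn` and `mlp` factors sum to 1, so wherever attention weights lean toward one model the MLP weights lean toward the other. Tensors matched by neither filter use the flat `0.5`.
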