---
base_model:
- UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
library_name: transformers
tags:
- mergekit
- merge
---
# merged

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the passthrough merge method.

### Models Merged

The following models were included in the merge:
* [UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3](https://huggingface.co/UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 24]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Original L24
- sources:
  - layer_range: [24, 25]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L24
- sources:
  - layer_range: [24, 25]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L24
- sources:
  - layer_range: [24, 25]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L25
- sources:
  - layer_range: [25, 26]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L25
- sources:
  - layer_range: [25, 26]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L25
- sources:
  - layer_range: [25, 26]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L26
- sources:
  - layer_range: [26, 27]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L26
- sources:
  - layer_range: [26, 27]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L26
- sources:
  - layer_range: [26, 27]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L27
- sources:
  - layer_range: [27, 28]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L27
- sources:
  - layer_range: [27, 28]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L27
- sources:
  - layer_range: [27, 28]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L28
- sources:
  - layer_range: [28, 29]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L28
- sources:
  - layer_range: [28, 29]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L28
- sources:
  - layer_range: [28, 29]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L29
- sources:
  - layer_range: [29, 30]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L29
- sources:
  - layer_range: [29, 30]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L29
- sources:
  - layer_range: [29, 30]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L30
- sources:
  - layer_range: [30, 31]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L30
- sources:
  - layer_range: [30, 31]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L30
- sources:
  - layer_range: [30, 31]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L31
- sources:
  - layer_range: [31, 32]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L31
- sources:
  - layer_range: [31, 32]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L31
- sources:
  - layer_range: [31, 32]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L32
- sources:
  - layer_range: [32, 33]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L32
- sources:
  - layer_range: [32, 33]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L32
- sources:
  - layer_range: [32, 33]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L33
- sources:
  - layer_range: [33, 34]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L33
- sources:
  - layer_range: [33, 34]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L33
- sources:
  - layer_range: [33, 34]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L34
- sources:
  - layer_range: [34, 35]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L34
- sources:
  - layer_range: [34, 35]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L34
- sources:
  - layer_range: [34, 35]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L35
- sources:
  - layer_range: [35, 36]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L35
- sources:
  - layer_range: [35, 36]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L35
- sources:
  - layer_range: [35, 36]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Original L36
- sources:
  - layer_range: [36, 37]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
# Dupe A of L36
- sources:
  - layer_range: [36, 37]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
# Dupe B of L36
- sources:
  - layer_range: [36, 37]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      scale:
      - filter: o_proj
        value: 0.0
      - filter: down_proj
        value: 0.0
      - value: 1.0
- sources:
  - layer_range: [37, 42]
    model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
```
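
The configuration above follows a regular pattern: layers 0 to 23 pass through unchanged, each of layers 24 to 36 appears once as-is and twice more with `o_proj` and `down_proj` scaled to 0.0, and layers 37 to 41 pass through unchanged. Zeroing those two output projections presumably makes each duplicate start out contributing nothing to the residual stream, though the card does not state the intent. The sketch below rebuilds the same slice list programmatically and counts the resulting layers. It uses only the Python standard library; `make_slice`, `build_config`, and `count_layers` are hypothetical helper names for illustration and are not part of mergekit.

```python
import json

MODEL = "UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3"

# Scale rules copied from the configuration above: zero the attention and
# MLP output projections, leave every other tensor at scale 1.0.
ZERO_OUTPUT_SCALE = [
    {"filter": "o_proj", "value": 0.0},
    {"filter": "down_proj", "value": 0.0},
    {"value": 1.0},
]


def make_slice(start, end, scaled=False):
    """Build one passthrough slice covering layers [start, end)."""
    source = {"layer_range": [start, end], "model": MODEL}
    if scaled:
        source["parameters"] = {"scale": [dict(rule) for rule in ZERO_OUTPUT_SCALE]}
    return {"sources": [source]}


def build_config(first_dup=24, last_dup=36, total_layers=42, dupes=2):
    """Recreate the slice pattern: original layer, then `dupes` scaled copies."""
    slices = [make_slice(0, first_dup)]
    for layer in range(first_dup, last_dup + 1):
        slices.append(make_slice(layer, layer + 1))  # "Original"
        for _ in range(dupes):  # "Dupe A", "Dupe B"
            slices.append(make_slice(layer, layer + 1, scaled=True))
    slices.append(make_slice(last_dup + 1, total_layers))
    return {"merge_method": "passthrough", "slices": slices}


def count_layers(config):
    """Total number of layers in the merged model."""
    return sum(
        src["layer_range"][1] - src["layer_range"][0]
        for entry in config["slices"]
        for src in entry["sources"]
    )


config = build_config()
print(len(config["slices"]), "slices,", count_layers(config), "layers")
# prints: 41 slices, 68 layers

# Show the first scaled duplicate as JSON for inspection.
print(json.dumps(config["slices"][2], indent=2))
```

Under these assumptions the merged model has 68 layers, compared with the 42 implied by the last `layer_range` of the source model. If PyYAML is installed, `yaml.safe_dump(config)` could be used to write the dictionary back out as YAML; the comments from the original file would not be reproduced.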