---
base_model:
- HollowMan6/Llama-3.2-3B-SFT-Model-Ocra-500k
- Lyte/Llama-3.2-3B-Overthinker
- EryriLabs/Llama-3.2-SARA-3b
library_name: transformers
tags:
- mergekit
- merge
---

# Llama-3.2-SARAThinker-merged-3b
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). It was developed as part of an ongoing blog series. I take no responsibility for the outputs of this model.

## Merge Details

### Merge Method

This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method, with [EryriLabs/Llama-3.2-SARA-3b](https://huggingface.co./EryriLabs/Llama-3.2-SARA-3b) as the base model.

### Models Merged

The following models were included in the merge:

* [HollowMan6/Llama-3.2-3B-SFT-Model-Ocra-500k](https://huggingface.co./HollowMan6/Llama-3.2-3B-SFT-Model-Ocra-500k)
* [Lyte/Llama-3.2-3B-Overthinker](https://huggingface.co./Lyte/Llama-3.2-3B-Overthinker)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: EryriLabs/Llama-3.2-SARA-3b
    parameters:
      density: 0.99 # keep 99% of the base model's weights
  - model: Lyte/Llama-3.2-3B-Overthinker
    parameters:
      density: 0.1 # fraction of weights in differences from the base model to retain
      weight: # weight gradient
        - filter: mlp
          value: 0.1
        - value: 0
  - model: HollowMan6/Llama-3.2-3B-SFT-Model-Ocra-500k
    parameters:
      density: 0.1
      weight: 0.2
merge_method: ties
base_model: EryriLabs/Llama-3.2-SARA-3b
parameters:
  normalize: true
  int8_mask: true
dtype: float16
```
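To reproduce the merge, save the configuration above to a file and run it through mergekit, either via the `mergekit-yaml` CLI or the library's Python entry points. Below is a minimal sketch assuming mergekit's documented Python API (`MergeConfiguration`, `run_merge`, `MergeOptions`); the file name `merge-config.yaml` and the output path are placeholders, not part of this card:

```python
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the TIES configuration shown above (assumed saved as merge-config.yaml)
with open("merge-config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the merge; the listed models are fetched from the Hugging Face Hub as needed
run_merge(
    merge_config,
    out_path="./Llama-3.2-SARAThinker-merged-3b",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU if one is present
        copy_tokenizer=True,             # copy the base model's tokenizer into the output
    ),
)
```

The equivalent CLI invocation is `mergekit-yaml merge-config.yaml ./Llama-3.2-SARAThinker-merged-3b --cuda`.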
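Once merged (or when pulling this model directly from the Hub), it loads like any other `transformers` causal LM. A minimal usage sketch; the repository id below is assumed from the model name on this card, and whether the merged model retains the Llama 3.2 chat template depends on the tokenizer that was copied in:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id, based on the model name above
model_id = "EryriLabs/Llama-3.2-SARAThinker-merged-3b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # the merge itself was produced in float16
    device_map="auto",
)

# Build a chat-formatted prompt and generate a reply
messages = [{"role": "user", "content": "Explain TIES merging in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```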