---
base_model:
- sometimesanotion/Lamarck-14B-v0.7-Fusion
- sometimesanotion/Lamarck-14B-v0.7
library_name: transformers
tags:
- mergekit
- merge
---

# LamarckInfusion-14B-v1

The merits of multi-stage arcee_fusion merges are clearly shown in [sometimesanotion/Lamarck-14B-v0.7-Fusion](https://huggingface.co./sometimesanotion/Lamarck-14B-v0.7-Fusion), which posts a valuable uptick in GPQA over its predecessors. Will those gains hold up under a modified version of the SLERP recipe from [suayptalha/Lamarckvergence-14B](https://huggingface.co./suayptalha/Lamarckvergence-14B)? Let's find out what these interpolation weights for the self-attention and MLP (perceptron) layers can unlock in this merge.

## Merge Details

### Merge Method

This model was merged using the [SLERP](https://en.wikipedia.org/wiki/Slerp) merge method, with [sometimesanotion/Lamarck-14B-v0.7](https://huggingface.co./sometimesanotion/Lamarck-14B-v0.7) as the base.
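
For intuition: SLERP interpolates along the arc between the two models' weight vectors rather than along the straight line between them, which preserves their norms better than plain averaging. Below is a minimal NumPy sketch of the idea; it is illustrative only, not mergekit's implementation, which applies the per-filter `t` schedules from the configuration further down tensor by tensor.

```python
# Illustrative SLERP between two flattened weight tensors (not mergekit's code).
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between v0 (t = 0) and v1 (t = 1)."""
    # Angle between the two vectors, measured on their unit directions.
    u0 = v0 / (np.linalg.norm(v0) + eps)
    u1 = v1 / (np.linalg.norm(v1) + eps)
    omega = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))
    if omega < eps:
        # Nearly colinear vectors: fall back to plain linear interpolation.
        return (1.0 - t) * v0 + t * v1
    so = np.sin(omega)
    return (np.sin((1.0 - t) * omega) / so) * v0 + (np.sin(t * omega) / so) * v1

# Example: blend two toy "weight" vectors 40% of the way toward v1.
print(slerp(0.4, np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])))
```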

### Models Merged

The following models were included in the merge:

* [sometimesanotion/Lamarck-14B-v0.7-Fusion](https://huggingface.co./sometimesanotion/Lamarck-14B-v0.7-Fusion)
* [sometimesanotion/Lamarck-14B-v0.7](https://huggingface.co./sometimesanotion/Lamarck-14B-v0.7)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
name: LamarckInfusion-14B-v1
base_model: sometimesanotion/Lamarck-14B-v0.7
merge_method: slerp
tokenizer_source: base
dtype: float32
out_dtype: bfloat16
parameters:
  t:
    # Interpolation factor per tensor group: 0.0 keeps the base model
    # (Lamarck-14B-v0.7), 1.0 takes Lamarck-14B-v0.7-Fusion; each list is a
    # gradient across layer depth.
    - filter: self_attn
      value: [0.2, 0.5, 0.4, 0.6, 0.8]
    - filter: mlp
      value: [0.8, 0.5, 0.6, 0.4, 0.2]
    # Default gradient for all remaining tensors.
    - value: [ 0.00, 0.00, 0.08, 0.16, 0.32, 0.48, 0.48, 0.48, 0.48, 0.48, 0.40, 0.32, 0.24 ]
slices:
  - sources:
      - model: sometimesanotion/Lamarck-14B-v0.7
        layer_range: [ 0, 48 ]
      - model: sometimesanotion/Lamarck-14B-v0.7-Fusion
        layer_range: [ 0, 48 ]
```
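
This configuration is consumed by mergekit (for example via its `mergekit-yaml` entry point) to produce the merged checkpoint. A minimal sketch for trying the result with `transformers` follows; the repository id is assumed to match the `name` field above and may differ from the actual upload.

```python
# Minimal loading sketch; the repo id is an assumption based on the config's `name`.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "sometimesanotion/LamarckInfusion-14B-v1"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="bfloat16",  # matches the merge's out_dtype
    device_map="auto",       # requires the `accelerate` package
)

prompt = "Explain spherical linear interpolation in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```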