Nitral-Archive
/

NightWing3-10B-v0.1

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

NightWing3-10B-v0.1 / README.md

Nitral-AI's picture

Update README.md

3c0a2e3 verified about 2 months ago

|

history blame contribute delete

1.34 kB

	---
	base_model:
	- Nitral-Archive/nightwing3-r64-2-latest_test-train-10B
	- Nitral-Archive/nightwing3-r64-1-latest_test-train-10B
	library_name: transformers
	tags:
	- mergekit
	- merge
	license: other
	language:
	- en
	---
	# Noticed some weird behavior in 4bpw exl2, not sure if this is contained or a model related issue. However after seeing some recent bugfixes regarding the targetting of lm training heads among a few other things, i will be attemping to retrain this for comparitive sake.

	# Base model: (Falcon3-10B)

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/642265bc01c62c1e4102dc36/C6gY9vxCl3_SFzQLpLG0S.png)

	# Prompt format: ChatML
	```
	<\|im_start\|>system
	{system_prompt}<\|im_end\|>
	<\|im_start\|>user
	{prompt}<\|im_end\|>
	<\|im_start\|>assistant
	```

	### The following YAML configuration was used to produce this model: (SLERP merge method)

	```yaml
	slices:
	- sources:
	- model: Nitral-Archive/nightwing3-r64-1-latest_test-train-10B
	layer_range: [0, 40]
	- model: Nitral-Archive/nightwing3-r64-2-latest_test-train-10B
	layer_range: [0, 40]
	merge_method: slerp
	base_model: Nitral-Archive/nightwing3-r64-1-latest_test-train-10B
	parameters:
	t:
	- filter: self_attn
	value: [0, 0.5, 0.3, 0.7, 1]
	- filter: mlp
	value: [1, 0.5, 0.7, 0.3, 0]
	- value: 0.420
	dtype: bfloat16

	```