# qwen-14b-1m-distillthink-sce / mergekit_config.yml
merge_method: sce
base_model: Qwen/Qwen2.5-14B
models:
  - model: Qwen/Qwen2.5-14B-Instruct-1M
  - model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
  - model: Qwen/Qwen2.5-Coder-14B-Instruct
parameters:
  select_topk: 0.75  # retain top 75% high-variance parameters
dtype: bfloat16
normalize: true