---
license: apache-2.0
language:
- en
- zh
base_model:
- Qwen/Qwen2.5-14B
- Azure99/Blossom-V6-14B
- arcee-ai/Virtuoso-Small-v2
- Qwen/Qwen2.5-14B-Instruct
- Qwen/Qwen2.5-14B-Instruct-1M
pipeline_tag: text-generation
tags:
- merge
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e174e202fa032de4143324/8YkBIMWfWNXm0dbNwj2HH.png)

# ZYH-LLM-Qwen2.5-14B-V3

This is the third-generation model of the **ZYH-LLM series**.

It relies heavily on model merging, aiming to provide a **powerful and unified 14-billion-parameter model** that serves as a solid foundation for further model merging and fine-tuning.

The full merge recipe is given below, in the hope that it proves useful for your own merges:

## First stage:

### Step 1:

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-14B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-1010
```

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct-1M
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-14B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-1010-1M
```

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: EVA-Qwen2.5-14B-YOYO-1010
```

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct-1M
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: EVA-Qwen2.5-14B-YOYO-1010-1M
```

### Step 2:

```yaml
models:
  - model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Qwen/Qwen2.5-14B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: EVA-Qwen2.5-14B-base
```

```yaml
merge_method: sce
models:
  - model: EVA-Qwen2.5-14B-base
base_model: Qwen/Qwen2.5-14B-Instruct-1M
parameters:
  select_topk: 1
dtype: bfloat16
tokenizer_source: base
normalize: true
int8_mask: true
name: Qwen2.5-14B-pro
```

### Step 3:

```yaml
models:
  - model: Qwen2.5-14B-YOYO-1010-1M
  - model: Qwen2.5-14B-YOYO-1010
  - model: EVA-Qwen2.5-14B-YOYO-1010-1M
  - model: EVA-Qwen2.5-14B-YOYO-1010
merge_method: sce
base_model: Qwen2.5-14B-pro
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: ZYH-LLM-Qwen2.5-14B-V3-preview
```

## Second stage:

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: arcee-ai/Virtuoso-Small-v2
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-della1
```

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct-1M
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: arcee-ai/Virtuoso-Small-v2
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-della2
```

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Azure99/Blossom-V6-14B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-della3
```

```yaml
models:
  - model: Qwen/Qwen2.5-14B-Instruct-1M
    parameters:
      density: 1
      weight: 1
      lambda: 0.9
merge_method: della
base_model: Azure99/Blossom-V6-14B
parameters:
  density: 1
  weight: 1
  lambda: 0.9
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: base
name: Qwen2.5-14B-YOYO-della4
```

## Final stage:

```yaml
merge_method: model_stock
base_model: ZYH-LLM-Qwen2.5-14B-V3-preview
models:
  - model: Qwen2.5-14B-YOYO-della1
  - model: Qwen2.5-14B-YOYO-della2
  - model: Qwen2.5-14B-YOYO-della3
  - model: Qwen2.5-14B-YOYO-della4
dtype: bfloat16
tokenizer_source: base
int8_mask: true
normalize: true
name: ZYH-LLM-Qwen2.5-14B-V3
```
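
Several of the configs above consume the outputs of earlier merges, so the order in which they run matters. The following is a minimal sketch of that dependency structure, not part of the original recipe. It assumes each `name:` field labels the output of its config and that un-prefixed model names refer to those earlier outputs. The `DEPENDENCIES` table and the `build_order` helper are hypothetical names introduced for illustration; the script only computes a valid build order and does not call mergekit.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Shorthand for the upstream checkpoints referenced in the configs above.
BASE = "Qwen/Qwen2.5-14B"
INSTRUCT = "Qwen/Qwen2.5-14B-Instruct"
INSTRUCT_1M = "Qwen/Qwen2.5-14B-Instruct-1M"
EVA = "EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2"
VIRTUOSO = "arcee-ai/Virtuoso-Small-v2"
BLOSSOM = "Azure99/Blossom-V6-14B"

# Each merge output mapped to the models it reads (its base_model plus its
# models list), as read off the configs in this card.
DEPENDENCIES = {
    # First stage, step 1
    "Qwen2.5-14B-YOYO-1010": {INSTRUCT, BASE},
    "Qwen2.5-14B-YOYO-1010-1M": {INSTRUCT_1M, BASE},
    "EVA-Qwen2.5-14B-YOYO-1010": {INSTRUCT, EVA},
    "EVA-Qwen2.5-14B-YOYO-1010-1M": {INSTRUCT_1M, EVA},
    # First stage, step 2
    "EVA-Qwen2.5-14B-base": {EVA, BASE},
    "Qwen2.5-14B-pro": {"EVA-Qwen2.5-14B-base", INSTRUCT_1M},
    # First stage, step 3
    "ZYH-LLM-Qwen2.5-14B-V3-preview": {
        "Qwen2.5-14B-YOYO-1010-1M",
        "Qwen2.5-14B-YOYO-1010",
        "EVA-Qwen2.5-14B-YOYO-1010-1M",
        "EVA-Qwen2.5-14B-YOYO-1010",
        "Qwen2.5-14B-pro",
    },
    # Second stage
    "Qwen2.5-14B-YOYO-della1": {INSTRUCT, VIRTUOSO},
    "Qwen2.5-14B-YOYO-della2": {INSTRUCT_1M, VIRTUOSO},
    "Qwen2.5-14B-YOYO-della3": {INSTRUCT, BLOSSOM},
    "Qwen2.5-14B-YOYO-della4": {INSTRUCT_1M, BLOSSOM},
    # Final stage
    "ZYH-LLM-Qwen2.5-14B-V3": {
        "ZYH-LLM-Qwen2.5-14B-V3-preview",
        "Qwen2.5-14B-YOYO-della1",
        "Qwen2.5-14B-YOYO-della2",
        "Qwen2.5-14B-YOYO-della3",
        "Qwen2.5-14B-YOYO-della4",
    },
}


def build_order(deps):
    """Return the merge outputs in an order where every input precedes its consumer."""
    order = TopologicalSorter(deps).static_order()
    # Upstream checkpoints are inputs only; keep just the merges we must run.
    return [name for name in order if name in deps]


if __name__ == "__main__":
    for step, name in enumerate(build_order(DEPENDENCIES), start=1):
        print(f"{step:2d}. {name}")
```

The exact order among independent merges (for example the four second-stage `della` merges) may vary between runs, since they do not depend on one another; only the relative order of dependent merges is guaranteed.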