---
license: apache-2.0
language:
  - en
tags:
  - merge
base_model:
  - teknium/OpenHermes-2.5-Mistral-7B
  - Intel/neural-chat-7b-v3-3
---

## Model Description

This is an experiment comparing two ways of merging the same two models: DARE TIES versus SLERP 🦙

We are mainly interested in comparing against Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp.

The two models involved in the merge are as follows:

  1. teknium/OpenHermes-2.5-Mistral-7B
  2. Intel/neural-chat-7b-v3-3

The YAML config file for the merge is:

```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    # no parameters necessary for base model
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.5
      density: 0.5
  - model: Intel/neural-chat-7b-v3-3
    parameters:
      weight: 0.5
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```
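
To reproduce the merge, the config above can be run with [mergekit](https://github.com/arcee-ai/mergekit). The sketch below uses mergekit's Python API and assumes the YAML has been saved locally as `config.yml`; the output path and option values are illustrative, and the `mergekit-yaml` CLI is an equivalent alternative.

```python
# Minimal sketch of running the DARE TIES merge with mergekit's Python API.
# Assumes `pip install mergekit` and the YAML config above saved as config.yml;
# the output directory and options are assumptions, not part of the original card.
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./openhermes-neural-dare-ties",  # hypothetical output directory
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use GPU if one is available
        copy_tokenizer=True,             # copy the tokenizer into the output
    ),
)
```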

## Open LLM Leaderboard

Note that with more tuning DARE TIES might achieve better results.

| Metric     | DARE TIES | SLERP |
|------------|-----------|-------|
| Average    | 70.69     | 71.38 |
| ARC        | 67.49     | 68.09 |
| HellaSwag  | 85.78     | 86.2  |
| MMLU       | 64.1      | 64.26 |
| TruthfulQA | 60.52     | 62.78 |
| Winogrande | 79.01     | 79.16 |
| GSM8K      | 67.25     | 67.78 |
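
For reference, a minimal way to load and query the DARE TIES merge with 🤗 Transformers is sketched below. The model path points at the hypothetical merge output directory from the sketch above (substitute the published repo id if one exists), and the ChatML prompt format is an assumption borrowed from teknium/OpenHermes-2.5-Mistral-7B; neural-chat-7b-v3-3 uses a different template, so check which format the merged model responds to best.

```python
# Minimal inference sketch with Hugging Face Transformers.
# Model path and prompt format are assumptions, not from the original card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./openhermes-neural-dare-ties"  # hypothetical local merge output
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto"
)

# ChatML-style prompt, as used by teknium/OpenHermes-2.5-Mistral-7B.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nBriefly explain model merging.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```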