Kunoichi-DPO-v2-7B / README.md
SanjiWatsuki's picture
Update README.md
c480a34 verified
|
raw
history blame
No virus
2.57 kB
| Model | MT Bench | EQ Bench | MMLU | Logic Test |
|----------------------|----------|----------|---------|-------------|
| GPT-4-Turbo | 9.32 | - | - | - |
| GPT-4 | 8.99 | 62.52 | 86.4 | 0.86 |
| **Kunoichi-DPO-v2-7B** | **8.51** | **42.18** | - | **0.58** |
| **Kunoichi-DPO-7B** | **8.29** | **41.60** | - | **0.59** |
| **Kunoichi-7B** | **8.14** | **44.32** | **64.9** | **0.58** |
| Starling-7B | 8.09 | - | 63.9 | 0.51 |
| Claude-2 | 8.06 | 52.14 | 78.5 | - |
| Silicon-Maid-7B | 7.96 | 40.44 | 64.7 | 0.54 |
| Loyal-Macaroni-Maid-7B | 7.95 | 38.66 | 64.9 | 0.57 |
| GPT-3.5-Turbo | 7.94 | 50.28 | 70 | 0.57 |
| Claude-1 | 7.9 | - | 77 | - |
| Openchat-3.5 | 7.81 | 37.08 | 64.3 | 0.39 |
| Dolphin-2.6-DPO | 7.74 | 42.88 | 61.9 | 0.53 |
| Zephyr-7B-beta | 7.34 | 38.71 | 61.4 | 0.30 |
| Llama-2-70b-chat-hf | 6.86 | 51.56 | 63 | - |
| Neural-chat-7b-v3-1 | 6.84 | 43.61 | 62.4 | 0.30 |
| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---:|---:|---:|---:|---:|
| **Kunoichi-DPO-7B**|**58.4**| 45.08 | 74| 66.99| 47.52|
| **Kunoichi-DPO-v2-7B**|**58.31**| 44.85| 75.05| 65.69| 47.65|
| [Kunoichi-7B](https://huggingface.co./SanjiWatsuki/Kunoichi-7B)|57.54| 44.99| 74.86| 63.72| 46.58|
| [OpenPipe/mistral-ft-optimized-1218](https://huggingface.co./OpenPipe/mistral-ft-optimized-1218)| 56.85 | 44.74 | 75.6 | 59.89 | 47.17 |
| [Silicon-Maid-7B](https://huggingface.co./SanjiWatsuki/Silicon-Maid-7B) | 56.45| 44.74| 74.26| 61.5| 45.32|
| [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co./mlabonne/NeuralHermes-2.5-Mistral-7B) | 53.51 | 43.67 | 73.24 | 55.37 | 41.76 |
| [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co./teknium/OpenHermes-2.5-Mistral-7B) | 52.42 | 42.75 | 72.99 | 52.99 | 40.94 |
| [openchat/openchat_3.5](https://huggingface.co./openchat/openchat_3.5) | 51.34 | 42.67 | 72.92 | 47.27 | 42.51 |
| [berkeley-nest/Starling-LM-7B-alpha](https://huggingface.co./berkeley-nest/Starling-LM-7B-alpha) | 51.16 | 42.06 | 72.72 | 47.33 | 42.53 |
| [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co./HuggingFaceH4/zephyr-7b-beta) | 50.99 | 37.33 | 71.83 | 55.1 | 39.7 |