	---
	base_model:
	- elyza/ELYZA-japanese-Llama-2-7b-fast
	- elyza/ELYZA-japanese-Llama-2-7b-fast-instruct
	license: llama2
	language:
	- ja
	tags:
	- mergekit
	- merge
	- MoE
	---
	# ELYZA-japanese-Llama-2-fast-MoE-2x7B-v0.1
	[English description here](#description)


	## Overview
	This is a Mixture of Experts (MoE) model built with [mergekit](https://github.com/cg123/mergekit) from [elyza/ELYZA-japanese-Llama-2-7b-fast](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b-fast), a Llama 2-based pretrained Japanese model, and its instruction-tuned variant [elyza/ELYZA-japanese-Llama-2-7b-fast-instruct](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b-fast-instruct).

	[The GGUF version is available here](https://huggingface.co/Aratako/ELYZA-japanese-Llama-2-fast-MoE-2x7B-v0.1-GGUF)

	It uses the following two models:
	- [elyza/ELYZA-japanese-Llama-2-7b-fast](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b-fast)
	- [elyza/ELYZA-japanese-Llama-2-7b-fast-instruct](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b-fast-instruct)

	## License
	This model inherits the Llama 2 license from its source models.

	Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.

	## Benchmark
	The [japanese-mt-bench](https://github.com/Stability-AI/FastChat/tree/jp-stable/fastchat/llm_judge) results for this model and for its base model, ELYZA-japanese-Llama-2-7b-fast-instruct, are shown below (single turn).

	|Model|Size|Coding|Extraction|Humanities|Math|Reasoning|Roleplay|STEM|Writing|avg_score|
	|---|---|---|---|---|---|---|---|---|---|---|
	| ELYZA-japanese-Llama-2-7b-fast-instruct | 7B | 2.8 | 5.2 | 7.1 | 2.0 | 3.6 | 6.0 | 5.9 | 6.4 | 4.8750 |
	| This model | 2x7B | 3.5 | 5.1 | 7.5 | 1.9 | 3.5 | 6.3 | 5.9 | 7.6 | 5.1625 |

	![Radar chart](./japanese_mt_bench.png)

	Prompt used for the benchmark:
	```
	"""<s>[INST] <<SYS>>
	あなたは誠実で優秀な日本人のアシスタントです。
	<</SYS>>
	{instruction} [/INST]"""
	```
	## Description
	This is a Mixture of Experts (MoE) model built with [mergekit](https://github.com/cg123/mergekit) from [elyza/ELYZA-japanese-Llama-2-7b-fast](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b-fast), a Llama 2-based pretrained Japanese model, and its instruction-tuned variant [elyza/ELYZA-japanese-Llama-2-7b-fast-instruct](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b-fast-instruct).

	[Click here for the GGUF version](https://huggingface.co/Aratako/ELYZA-japanese-Llama-2-fast-MoE-2x7B-v0.1-GGUF)

	It uses the following two models:
	- [elyza/ELYZA-japanese-Llama-2-7b-fast](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b-fast)
	- [elyza/ELYZA-japanese-Llama-2-7b-fast-instruct](https://huggingface.co/elyza/ELYZA-japanese-Llama-2-7b-fast-instruct)

	## License
	This model inherits the Llama 2 license from its source models.

	Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.

	## Benchmark
	The [japanese-mt-bench](https://github.com/Stability-AI/FastChat/tree/jp-stable/fastchat/llm_judge) results for this model and for its base model, ELYZA-japanese-Llama-2-7b-fast-instruct, are shown below (single turn).

	|Model|Size|Coding|Extraction|Humanities|Math|Reasoning|Roleplay|STEM|Writing|avg_score|
	|---|---|---|---|---|---|---|---|---|---|---|
	| ELYZA-japanese-Llama-2-7b-fast-instruct | 7B | 2.8 | 5.2 | 7.1 | 2.0 | 3.6 | 6.0 | 5.9 | 6.4 | 4.8750 |
	| This model | 2x7B | 3.5 | 5.1 | 7.5 | 1.9 | 3.5 | 6.3 | 5.9 | 7.6 | 5.1625 |

	![Radar chart](./japanese_mt_bench.png)
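
The `avg_score` column can be checked against the per-category scores. The sketch below assumes `avg_score` is the unweighted mean of the eight category scores, which the README does not state explicitly. The names `CATEGORIES`, `SCORES`, and `average_score` are illustrative only.

```python
# Per-category japanese-mt-bench scores copied from the benchmark table (single turn).
CATEGORIES = [
    "Coding", "Extraction", "Humanities", "Math",
    "Reasoning", "Roleplay", "STEM", "Writing",
]
SCORES = {
    "ELYZA-japanese-Llama-2-7b-fast-instruct": [2.8, 5.2, 7.1, 2.0, 3.6, 6.0, 5.9, 6.4],
    "This model": [3.5, 5.1, 7.5, 1.9, 3.5, 6.3, 5.9, 7.6],
}


def average_score(scores):
    """Unweighted mean over categories (assumed definition of avg_score)."""
    return sum(scores) / len(scores)


# Recompute the average for each model.
for name, scores in SCORES.items():
    print(f"{name}: {average_score(scores):.4f}")

# Per-category difference (this model minus the base model), rounded to one decimal.
base = SCORES["ELYZA-japanese-Llama-2-7b-fast-instruct"]
merged = SCORES["This model"]
deltas = {c: round(m - b, 1) for c, b, m in zip(CATEGORIES, base, merged)}
print(deltas)
```

Under that assumption the recomputed means agree with the table (4.8750 and 5.1625). The largest gains are in Writing (+1.2) and Coding (+0.7), while Extraction, Math, and Reasoning each drop by 0.1.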

	Prompt used for the benchmark:
	```
	"""<s>[INST] <<SYS>>
	あなたは誠実で優秀な日本人のアシスタントです。
	<</SYS>>
	{instruction} [/INST]"""
	```
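
To reproduce the prompt programmatically, a minimal sketch could look like the following. It fills the template exactly as shown above; the function name `build_prompt` and the `system_prompt` parameter are hypothetical helpers, not part of this repository.

```python
# System prompt copied verbatim from the benchmark template.
DEFAULT_SYSTEM_PROMPT = "あなたは誠実で優秀な日本人のアシスタントです。"


def build_prompt(instruction: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Fill the Llama 2 style chat template used for the benchmark."""
    return (
        "<s>[INST] <<SYS>>\n"
        f"{system_prompt}\n"
        "<</SYS>>\n"
        f"{instruction} [/INST]"
    )


if __name__ == "__main__":
    # Example instruction: "Please introduce yourself."
    print(build_prompt("自己紹介をしてください。"))
```

The template includes a literal `<s>` token. If your tokenizer already prepends a beginning-of-sequence token, you may need to drop the literal one or disable the automatic insertion to avoid a duplicate.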

	## Merge config
	[mergekit_config.yml](./mergekit_moe_config.yml)
	```yaml
	base_model: ./ELYZA-japanese-Llama-2-7b-fast-instruct
	gate_mode: random
	dtype: bfloat16
	experts:
	  - source_model: ./ELYZA-japanese-Llama-2-7b-fast-instruct
	    positive_prompts: []
	  - source_model: ./ELYZA-japanese-Llama-2-7b-fast
	    positive_prompts: []
	tokenizer_source: model:./ELYZA-japanese-Llama-2-7b-fast-instruct
	```
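
According to mergekit's MoE documentation, `gate_mode: random` initializes the router (gate) weights randomly instead of deriving them from `positive_prompts`, which is consistent with the empty prompt lists in the config. The toy sketch below illustrates what such a gate does conceptually for two experts. It is not mergekit's implementation: the function names, the initialization scale of 0.02, and the assumption that both experts are active for every token are all illustrative.

```python
import math
import random


def random_gate(num_experts, hidden_size, seed=0):
    """Randomly initialized router: one weight row per expert (scale is an assumption)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)] for _ in range(num_experts)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(hidden_state, gate):
    """Return one mixing weight per expert for a single token's hidden state."""
    logits = [sum(w * h for w, h in zip(row, hidden_state)) for row in gate]
    return softmax(logits)


def moe_output(hidden_state, gate, experts):
    """Weighted sum of expert outputs, with every expert active for the token."""
    weights = route(hidden_state, gate)
    outputs = [expert(hidden_state) for expert in experts]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(hidden_state))
    ]


if __name__ == "__main__":
    # Toy stand-ins for the two expert feed-forward blocks.
    experts = [
        lambda x: [2.0 * v for v in x],  # stands in for the instruct expert
        lambda x: [v + 1.0 for v in x],  # stands in for the base expert
    ]
    gate = random_gate(num_experts=2, hidden_size=4)
    token = [0.5, -1.0, 0.25, 2.0]
    print(route(token, gate))
    print(moe_output(token, gate, experts))
```

With a randomly initialized gate, the mixing weights carry no learned preference between the two experts, so any routing specialization would have to come from later fine-tuning.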