---
base_model: meta-llama/Meta-Llama-3-8B-Instruct
language:
- en
license: mit
---
# Llama 3 8b Instruct MOE
Llama 3 8B Instruct converted to an MoE-style model by randomly partitioning the FFN layer of each decoder layer into 8 experts of equal size. All weights are taken directly from the Llama 3 8B Instruct base model; no additional training is implied by the conversion itself.
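
The sketch below illustrates the partitioning idea in PyTorch. It is a minimal, hypothetical reconstruction, not the actual conversion script: names such as `partition_ffn` and the fixed `seed` are assumptions. It splits one Llama FFN (gate/up/down projections) into 8 expert FFNs along the intermediate dimension, assigning intermediate units to experts at random.

```python
import torch

NUM_EXPERTS = 8  # number of experts per decoder layer, as stated above


def partition_ffn(gate_proj, up_proj, down_proj, num_experts=NUM_EXPERTS, seed=0):
    """Randomly partition one FFN's weights into `num_experts` equal expert FFNs.

    Shapes follow Llama conventions:
      gate_proj, up_proj: [intermediate, hidden]
      down_proj:          [hidden, intermediate]
    Returns a list of (gate, up, down) weight tuples, one per expert.
    """
    intermediate = gate_proj.shape[0]
    assert intermediate % num_experts == 0, "intermediate dim must split evenly"
    chunk = intermediate // num_experts

    # Random permutation of the intermediate units, then equal-sized chunks:
    # each expert owns a disjoint random subset of the original FFN's units.
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(intermediate, generator=g)

    experts = []
    for e in range(num_experts):
        idx = perm[e * chunk:(e + 1) * chunk]
        experts.append((
            gate_proj[idx, :],  # expert's slice of gate_proj rows
            up_proj[idx, :],    # matching slice of up_proj rows
            down_proj[:, idx],  # matching slice of down_proj columns
        ))
    return experts
```

One property worth noting: because the down projection is linear and the SwiGLU activation acts elementwise on the intermediate dimension, activating all 8 experts and summing their outputs reproduces the original FFN exactly. The model's behavior diverges from the base model only once a router selects a subset of experts per token.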