Commit 9aab7ae by leonzhou286 (1 parent: 27e158f): Create README.md
Llama 3 8b Instruct MOE
The Llama 3 8B Instruct model, converted to a mixture-of-experts (MoE) layout by randomly partitioning the FFN of each decoder layer into 8 experts of equal size. The weights are taken directly from the original Llama 3 Instruct model.
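As a rough illustration of the partitioning idea, the sketch below splits the intermediate neurons of a toy SwiGLU FFN into 8 random groups of equal size and checks that the expert outputs sum to the dense FFN output. It is not the conversion script used for this model: the neuron-level split, the helper names (`partition_neurons`, `ffn`), the seed, and the toy sizes are all assumptions, and routing between experts is not covered.

```python
import math
import random


def partition_neurons(intermediate_size, num_experts, seed=0):
    """Randomly split FFN intermediate-neuron indices into equal-size groups."""
    if intermediate_size % num_experts != 0:
        raise ValueError("intermediate_size must be divisible by num_experts")
    indices = list(range(intermediate_size))
    random.Random(seed).shuffle(indices)
    size = intermediate_size // num_experts
    return [sorted(indices[i * size:(i + 1) * size]) for i in range(num_experts)]


def silu(x):
    return x / (1.0 + math.exp(-x))


def ffn(x, gate, up, down, neurons):
    """SwiGLU FFN restricted to the given intermediate neurons.

    gate, up: one row of input weights per intermediate neuron.
    down: one row of output weights per intermediate neuron
          (stored transposed here to keep the loop simple).
    """
    out = [0.0] * len(down[0])
    for n in neurons:
        g = sum(w * xi for w, xi in zip(gate[n], x))
        u = sum(w * xi for w, xi in zip(up[n], x))
        h = silu(g) * u
        for j, w in enumerate(down[n]):
            out[j] += h * w
    return out


# Toy sizes (hypothetical); the real model is far larger.
hidden_size, intermediate_size, num_experts = 4, 16, 8
rng = random.Random(1)


def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


gate = rand_matrix(intermediate_size, hidden_size)
up = rand_matrix(intermediate_size, hidden_size)
down = rand_matrix(intermediate_size, hidden_size)
x = [rng.uniform(-1, 1) for _ in range(hidden_size)]

# Each expert reuses a disjoint slice of the original weights, unchanged.
experts = partition_neurons(intermediate_size, num_experts)
dense_out = ffn(x, gate, up, down, range(intermediate_size))
expert_outs = [ffn(x, gate, up, down, e) for e in experts]
moe_out = [sum(vals) for vals in zip(*expert_outs)]
```

Because the SwiGLU activation acts on each intermediate neuron independently, summing all 8 expert outputs reproduces the dense output; activating only a subset of experts would give an approximation of it.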