How do you compress a MoE into a single dense model?

by AiModelsMarket - opened Apr 13

Apr 13

Hello,
what you do is wonderful! Could you explain how you compress a MoE into a single dense model? Would you be willing to share a Colab notebook for it? I have a few MoE models in mind that I would love to try compressing myself. Thank you in advance for any info or help you can share. You are great!

jrruethe

Apr 14

I would be grateful if you could share the script you got from Charles Goddard that helped you make this. I'd love to learn more!

DataSoul

Apr 14

https://huggingface.co/thomasgauthier/Unmixtraled-22B-v0.1-expert-2

You might find this interesting: it is a 22B Mistral model that recycles weights from mistral-community/Mixtral-8x22B-v0.1. The model was adapted from the Mixtral architecture to a dense Mistral architecture with the same number of layers, attention heads, and hidden dimensions.
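
For anyone who wants to experiment, here is a rough sketch of how such a conversion could look at the checkpoint level. It is not the script used for the model above (that script is not posted in this thread). It assumes, judging only by the "expert-2" in the model name, that one expert per layer is kept and the router is discarded. It also assumes the Hugging Face Mixtral key layout `block_sparse_moe.experts.N.w1/w2/w3` and the Mistral layout `mlp.gate_proj/down_proj/up_proj`, which may differ between transformers versions. The function name `extract_expert` and the toy state dict are hypothetical, for illustration only.

```python
import re

# Assumed mapping from Mixtral expert projections to dense Mistral MLP names:
# w1 feeds the activation (gate), w3 is the multiplicative branch (up),
# w2 projects back to the hidden size (down).
EXPERT_TO_DENSE = {"w1": "gate_proj", "w2": "down_proj", "w3": "up_proj"}

_EXPERT_KEY = re.compile(
    r"^(model\.layers\.\d+)\.block_sparse_moe\.experts\.(\d+)\.(w[123])\.(weight)$"
)
_ROUTER_KEY = re.compile(r"^model\.layers\.\d+\.block_sparse_moe\.gate\.")


def extract_expert(moe_state, expert_idx):
    """Build a dense-style state dict that keeps a single expert per layer.

    Hypothetical helper: attention, norm, and embedding entries are copied
    unchanged, the chosen expert is renamed to the dense MLP names, and the
    router plus all other experts are dropped.
    """
    dense_state = {}
    for key, tensor in moe_state.items():
        if _ROUTER_KEY.match(key):
            continue  # a dense model has no router
        match = _EXPERT_KEY.match(key)
        if match is None:
            dense_state[key] = tensor  # shared weights pass through as-is
            continue
        prefix, idx, proj, suffix = match.groups()
        if int(idx) == expert_idx:
            new_key = f"{prefix}.mlp.{EXPERT_TO_DENSE[proj]}.{suffix}"
            dense_state[new_key] = tensor
    return dense_state


# Toy state dict with placeholder strings standing in for real tensors.
toy = {
    "model.embed_tokens.weight": "emb",
    "model.layers.0.self_attn.q_proj.weight": "q",
    "model.layers.0.block_sparse_moe.gate.weight": "router",
    "model.layers.0.block_sparse_moe.experts.0.w1.weight": "e0w1",
    "model.layers.0.block_sparse_moe.experts.0.w2.weight": "e0w2",
    "model.layers.0.block_sparse_moe.experts.0.w3.weight": "e0w3",
    "model.layers.0.block_sparse_moe.experts.2.w1.weight": "e2w1",
    "model.layers.0.block_sparse_moe.experts.2.w2.weight": "e2w2",
    "model.layers.0.block_sparse_moe.experts.2.w3.weight": "e2w3",
}
dense = extract_expert(toy, expert_idx=2)
for name in sorted(dense):
    print(name)
```

On a real checkpoint you would apply the same renaming to the loaded tensors and also switch the config to the dense architecture. A model produced this way will most likely need further training before its outputs are useful, since each expert was only trained to handle the tokens routed to it.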
