Consider filtering for MoE models

#956
by ThiloteE - opened

I want to compare MoE models with each other. This is NOT easy, because the leaderboard only lets me "hide" them; it is not possible to hide dense models instead or to filter for MoE models directly. Their naming scheme is not standardized either: as far as I am aware, they do not follow the [GGUF naming convention](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md#gguf-naming-convention) or any other standard, so it is hard to find them via the search feature.

Consider the following MoE model names:

  • allenai/OLMoE-1B-7B-0924
  • microsoft/GRIN-MoE
  • Qwen/Qwen1.5-MoE-A2.7B-Chat
  • Jamba-12B-52B
  • Qwen/Qwen1.5-3B-14B
  • JetMoE-2B-9B
  • OpenMoE-2B-9B
  • Arctic-17B-480B
  • mistralai/Mixtral-8x7B-Instruct-v0.1

Not all of them have "MoE" in the model name.

Part of the difficulty in finding them is that an MoE model has two parameter counts:

a) activated parameter count
b) total parameter count

The total parameter count and the activated parameter count should not be confused with the number of experts per layer and the number of activated experts per layer, respectively. For example, mistralai/Mixtral-8x7B-Instruct-v0.1 activates 2 of its 8 experts per layer, which corresponds to roughly 12.9B active parameters out of about 46.7B total.
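
To illustrate how inconsistently these numbers are encoded, here is a rough heuristic sketch (ad-hoc patterns of my own, not any official convention) that tries to recover parameter information from a few of the names above; the point is that no single pattern covers them all:

```python
import re

# Ad-hoc patterns, one per naming style seen above; none of these is a standard.
PATTERNS = [
    # "A2.7B" style (Qwen1.5-MoE-A2.7B): encodes only the activated parameter count
    (re.compile(r"-A(\d+(?:\.\d+)?)B"), "activated params only"),
    # "1B-7B" / "12B-52B" style (OLMoE, Jamba): activated/total pair
    (re.compile(r"(\d+(?:\.\d+)?)B-(\d+(?:\.\d+)?)B"), "activated/total pair"),
    # "8x7B" style (Mixtral): experts x per-expert size; neither number is the
    # activated or the total parameter count on its own
    (re.compile(r"(\d+)x(\d+(?:\.\d+)?)B"), "experts x per-expert size"),
]

def describe(name: str) -> str:
    for pattern, kind in PATTERNS:
        match = pattern.search(name)
        if match:
            return f"{name}: {kind} {match.groups()}"
    return f"{name}: nothing recoverable from the name"

for name in [
    "allenai/OLMoE-1B-7B-0924",
    "Qwen/Qwen1.5-MoE-A2.7B-Chat",
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "microsoft/GRIN-MoE",
]:
    print(describe(name))
```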

Open LLM Leaderboard org
•
edited Oct 1

Hi @ThiloteE,

Thank you for opening this discussion!

Currently, if you want to analyse MoE models, you can use our Contents dataset – it has a MoE column, so if you select MoE=true you will be able to see all the MoE models we have right now.
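
For example, a minimal sketch with the `datasets` library could look like the following (assuming the Contents dataset is published on the Hub as `open-llm-leaderboard/contents` with a `train` split, and that the MoE column is a boolean where true marks MoE models; please double-check the exact dataset id on our org page):

```python
from datasets import load_dataset

# Assumed dataset id and split - verify the exact names on the leaderboard org page.
contents = load_dataset("open-llm-leaderboard/contents", split="train")
df = contents.to_pandas()

# Assumes the MoE column is a boolean flag where True marks mixture-of-experts models.
moe_models = df[df["MoE"] == True]
print(f"{len(moe_models)} MoE models out of {len(df)} rows")
print(moe_models.head())
```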

We're planning to improve the Leaderboard's UI in a future release. As part of this update, we'll consider implementing more advanced filtering options for MoE models.

Open LLM Leaderboard org

Closing this discussion. Please feel free to ping me here if you have any questions about MoE models, or start a new discussion.

alozowski changed discussion status to closed
