NeuralMix-2x7b

This model is a Mixture of Experts (MoE) made with mergekit (mixtral branch). It uses the following base models:

💻 Usage

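# Notebook cell: install the dependencies (bitsandbytes is needed for 4-bit loading)
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/NeuralMix-2x7b"

# Load the tokenizer once and pass it to the pipeline so both use the same chat template
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    # Quantize the weights to 4-bit on load and run computations in float16
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Format the conversation with the model's chat template and append the generation prompt
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Sample a completion; generated_text contains the prompt followed by the model's answer
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])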

Output:

A Mixture of Experts (ME) is a neural network architecture that allows for adaptive specialization of its hidden layers. It consists of an input layer, a mixture of expert layers with a set of hidden layers, and an output layer. The expert layers have different specializations and each one is responsible for predicting the output for a particular subset of the input data. The mixture of experts uses a gating network to dynamically select the expert layer that best fits the current input data. This adaptive approach can improve the performance and generalization capabilities of the neural network. 

The Mixture of Experts model is particularly useful in situations where the data is complex, heterogeneous, or has varying structures. By enabling each expert to specialize in a particular type of input, the Mixture of Experts can learn to effectively handle diverse input data and provide more accurate predictions. 

Overall, the Mixture of Experts can be seen as a type of neural network that combines the strengths of multiple models to create a more powerful and flexible predictive tool. 
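
The gating idea described in the sample output above can be illustrated with a small, dependency-free sketch. This is not the router used by this model or by mergekit. It is a generic top-k gate written for illustration, and the names `moe_forward`, `gate_weights`, and the toy expert functions are all hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x through the top_k experts chosen by a linear gate.

    gate_weights holds one row per expert (hypothetical, illustration only).
    Returns the mixed output, the chosen expert indices, and their weights.
    """
    # One gate logit per expert: dot product of the expert's gate row with x
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Keep the top_k experts with the highest logits
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Normalize the selected logits so the mixing weights sum to 1
    weights = softmax([logits[i] for i in chosen])
    # Weighted sum of the selected experts' outputs
    out = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        y = experts[i](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, chosen, weights


# Toy experts standing in for feed-forward blocks (assumed for illustration)
toy_experts = [
    lambda v: [2.0 * a for a in v],   # expert 0: doubles the input
    lambda v: [a + 1.0 for a in v],   # expert 1: shifts the input by one
    lambda v: [-a for a in v],        # expert 2: negates the input
]
toy_gate = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

mixed, chosen, weights = moe_forward([1.0, 2.0], toy_gate, toy_experts, top_k=2)
print(chosen, weights, mixed)
```

Only the selected experts are evaluated for a given input, which is why an MoE can hold more parameters than it uses on any single forward pass.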

Model size: 12.9B params (Safetensors)
Tensor type: BF16, FP16
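
As a rough guide to why the usage example loads the model with `load_in_4bit`, the 12.9B parameter count listed above can be turned into a back-of-the-envelope estimate of weight memory. This is a simplified sketch: it counts weights only and assumes every parameter is stored at the stated width, so it ignores activations, the KV cache, quantization metadata, and any layers kept in higher precision. The helper `weight_memory_gb` is a hypothetical name.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 12.9e9  # parameter count stated on the model card

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # 2 bytes per parameter
int4_gb = weight_memory_gb(NUM_PARAMS, 4)   # 0.5 bytes per parameter

print(f"FP16 weights:  ~{fp16_gb:.2f} GB")
print(f"4-bit weights: ~{int4_gb:.2f} GB")
```

Under these assumptions the weights take roughly 25.8 GB in 16-bit precision and roughly 6.45 GB in 4-bit, so actual usage will be somewhat higher in both cases.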