---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- mistral
- mixtral
- moe
model_name: Mixtral 8X7B - bnb 4-bit
inference: false
model_type: mixtral
pipeline_tag: text-generation
quantized_by: ybelkada
---

# Mixtral 8x7B Instruct-v0.1 - `bitsandbytes` 4-bit

This repository contains the `bitsandbytes` 4-bit quantized version of [`mistralai/Mixtral-8x7B-Instruct-v0.1`](https://huggingface.co./mistralai/Mixtral-8x7B-Instruct-v0.1).

To use it, make sure you have the latest version of `bitsandbytes` installed, and `transformers` installed from source.

Loading the model as follows will load it directly in 4-bit precision, since the quantization configuration is saved alongside the checkpoint:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ybelkada/Mixtral-8x7B-Instruct-v0.1-bnb-4bit"
model = AutoModelForCausalLM.from_pretrained(model_id)
```

Note that you need a CUDA-compatible GPU to run low-bit precision models with `bitsandbytes`.
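As a minimal end-to-end usage sketch, assuming `accelerate` is installed so that `device_map="auto"` can place the quantized weights on the GPU, and that the tokenizer's chat template produces the Mixtral `[INST] ... [/INST]` instruct format; the prompt and generation settings below are only illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ybelkada/Mixtral-8x7B-Instruct-v0.1-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The 4-bit quantization config stored with the checkpoint is applied automatically;
# device_map="auto" (requires `accelerate`) places the quantized weights on the GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Illustrative prompt; apply_chat_template wraps it in the instruct format.
messages = [{"role": "user", "content": "What is a mixture-of-experts model?"}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```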
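If you want to confirm that the checkpoint really loads in 4-bit, one possible check is to look for `bitsandbytes` `Linear4bit` modules and at the overall memory footprint; the snippet below is a rough sketch under that assumption:

```python
import bitsandbytes as bnb
from transformers import AutoModelForCausalLM

model_id = "ybelkada/Mixtral-8x7B-Instruct-v0.1-bnb-4bit"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# In a 4-bit load, the dense linear layers are replaced by bitsandbytes
# Linear4bit modules, and the footprint sits far below the ~90 GB that the
# fp16 weights of Mixtral 8x7B would need.
print(any(isinstance(m, bnb.nn.Linear4bit) for m in model.modules()))
print(f"{model.get_memory_footprint() / 1e9:.1f} GB")
```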