GLEAM-Mixtral-8x7B-Instruct
Overview
GLEAM-Mixtral-8x7B-Instruct is an experimental preference-aligned model built on top of mistralai/Mixtral-8x7B-Instruct-v0.1.
Model Description
The model has been optimized with ORPO on a self-generated synthetic preference dataset using 300 examples from argilla/ultrafeedback-binarized-preferences and prompts from Open-Orca/SlimOrca.
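For readers unfamiliar with ORPO, the sketch below shows the general shape of its objective: the usual supervised loss on the preferred response plus a weighted odds-ratio penalty that pushes the preferred response's odds above the rejected one's. This is a minimal illustration of the published ORPO formulation, not the training code used for this model. The function names and the weight `lam=0.1` are assumptions made for the example, and the inputs are taken to be length-averaged log-probabilities.

```python
import math

def log_odds(avg_logp):
    """Log-odds of a response given its average token log-probability.

    odds(y|x) = p / (1 - p) with p = exp(avg_logp); avg_logp must be < 0.
    """
    return avg_logp - math.log1p(-math.exp(avg_logp))

def orpo_loss(nll_chosen, avg_logp_chosen, avg_logp_rejected, lam=0.1):
    """Hypothetical helper: SFT loss plus lam times the odds-ratio loss.

    lam is an assumed weighting, not a value reported for this model.
    """
    log_odds_ratio = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    # -log(sigmoid(r)) written as log(1 + exp(-r))
    odds_ratio_loss = math.log1p(math.exp(-log_odds_ratio))
    return nll_chosen + lam * odds_ratio_loss

# The penalty shrinks as the chosen response becomes more likely than the rejected one.
tied = orpo_loss(0.0, math.log(0.5), math.log(0.5), lam=1.0)
separated = orpo_loss(0.0, math.log(0.8), math.log(0.2), lam=1.0)
print(tied, separated)
```

When both responses are equally likely, the odds-ratio term equals log 2; it falls toward zero as the preferred response gains probability relative to the rejected one.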
Prompt Format
The model utilizes the standard Mixtral prompt format:
Prompt Example:
<s> [INST] {prompt-0} [/INST] {response}</s> [INST] {prompt-1} [/INST]
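As a rough illustration of the template above, the following helper assembles a multi-turn prompt string. The function name `build_mixtral_prompt` and the turn representation (a list of `(user, assistant)` pairs, with `None` for the pending reply) are assumptions made for this sketch and are not part of the model's API.

```python
def build_mixtral_prompt(turns, bos="<s>", eos="</s>"):
    """Hypothetical helper: render (user, assistant) turns in the Mixtral format.

    Pass None as the assistant text for the final turn the model should answer.
    """
    parts = [bos]
    for user, assistant in turns:
        parts.append(f" [INST] {user} [/INST]")
        if assistant is not None:
            # Completed assistant turns are closed with the end-of-sequence token.
            parts.append(f" {assistant}{eos}")
    return "".join(parts)

prompt = build_mixtral_prompt([
    ("Hi", "Hello!"),
    ("How are you?", None),
])
print(prompt)
# <s> [INST] Hi [/INST] Hello!</s> [INST] How are you? [/INST]
```

If the string is later passed to a tokenizer that prepends `<s>` on its own, either drop the leading `bos` here (pass `bos=""`) or call the tokenizer with `add_special_tokens=False` so the token is not duplicated.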
Benchmarks
The table below compares GLEAM-Mixtral with the original Mixtral model on tinyBenchmarks:
Benchmark | Mixtral (5-shot) | GLEAM-Mixtral (5-shot) |
---|---|---|
MMLU | 66.8 | 65.5 |
Hellaswag | 87.4 | 77.8 |
ARC | 69.0 | 50.6 |
WinoGrande | 80.7 | 79.5 |
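The degradation is clear on some benchmarks: ARC drops by 18.4 points and Hellaswag by 9.6 points, while MMLU and WinoGrande lose 1.3 and 1.2 points respectively.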
Model Alignment
Preference alignment tests show that GLEAM-Mixtral outperforms Mixtral when judged by a preference model trained on argilla/ultrafeedback-binarized-preferences and evaluated on prompts from the validation set of the prompt dataset:
Model | Win Rate (95% CI) |
---|---|
Mixtral | 40.89% ± 6.13% |
GLEAM-Mixtral | 59.11% ± 6.13% |
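For reference, a win rate with a 95% interval of this kind can be computed with the normal approximation to a binomial proportion, as sketched below. The counts used here (146 wins out of 247 comparisons) are an assumption: the model card does not state the sample size, and these values were picked only because they reproduce the reported figures. The actual evaluation may have used a different sample size or interval method.

```python
import math

def win_rate_ci(wins, total, z=1.96):
    """Return (win rate, half-width) of a normal-approximation 95% interval."""
    p = wins / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, half_width

# Hypothetical counts, not taken from the model card.
rate, half_width = win_rate_ci(wins=146, total=247)
print(f"{100 * rate:.2f}% ± {100 * half_width:.2f}%")
# 59.11% ± 6.13%
```

Because every comparison has exactly one winner, the two rows of the table are complementary and share the same interval half-width.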
Usage
I wouldn't actually recommend using this model for any practical application beyond evaluation and testing, but it's a nice proof of concept.
How to Use (if you insist on using it)
You can load GLEAM-Mixtral-8x7B-Instruct-v2.0 through the Hugging Face transformers library. The Python snippet below loads the model and generates a response:
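from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "Txoka/GLEAM-Mixtral-8x7B-Instruct-v2.0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# torch_dtype="auto" keeps the dtype stored in the checkpoint instead of upcasting to float32
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
# The tokenizer prepends the <s> token, so the prompt starts directly at [INST]
prompt = "[INST] Write a diary entry from the perspective of a cat who believes it's the ruler of the household. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt")
# Set max_new_tokens explicitly; the default generation length is too short for a full reply
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))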