
EXL2 quants of Mistral-7B-instruct

Converted from Mistral-7B-Instruct-v0.1. This is a straight conversion, except that I have modified config.json to set the default context size to 7168 tokens, since in initial testing the model became unstable some way past that length. It's possible that sliding window attention will allow the model to use its advertised 32k-token context, but this hasn't been tested yet.
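If you want to experiment with a different default context size, the following is a minimal sketch of editing config.json in a local copy of the model. It assumes the context length is stored under the `max_position_embeddings` key, as in the usual Hugging Face Mistral config; check your copy of the file before relying on it. The helper name `set_default_context` is hypothetical and not part of any library.

```python
import json
from pathlib import Path


def set_default_context(config_path, max_tokens):
    """Rewrite the context-length field in config.json; return the previous value."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Assumption: the loader takes its default context length from this key.
    previous = config.get("max_position_embeddings")
    config["max_position_embeddings"] = max_tokens

    # Write the file back with all other fields left untouched.
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return previous
```

For example, `set_default_context("path/to/model/config.json", 4096)` would lower the default, and the returned value lets you restore the original setting later. Going above 7168 is untested, as noted above.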

2.50 bits per weight
2.70 bits per weight
3.00 bits per weight
3.50 bits per weight
4.00 bits per weight
4.65 bits per weight
5.00 bits per weight
6.00 bits per weight
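
As a rough guide to choosing among these, the size of the quantized weights scales linearly with bits per weight. The sketch below estimates it as parameters × bits ÷ 8. The parameter count of about 7.24 billion is an assumption based on the commonly quoted figure for Mistral 7B, and the function name is hypothetical. Actual file sizes will differ somewhat, since the bits-per-weight figure is an average and some tensors may be stored at a different precision.

```python
ASSUMED_PARAMS = 7.24e9  # assumption: approximate parameter count of Mistral 7B


def estimate_weight_bytes(num_params, bits_per_weight):
    """Approximate storage for the weights alone, in bytes."""
    return num_params * bits_per_weight / 8


for bpw in (2.50, 2.70, 3.00, 3.50, 4.00, 4.65, 5.00, 6.00):
    gb = estimate_weight_bytes(ASSUMED_PARAMS, bpw) / 1e9
    print(f"{bpw:.2f} bpw: ~{gb:.2f} GB")
```

The estimate covers weights only; the memory needed for the cache and activations at a given context length comes on top of it.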

measurement.json