This repository contains exl2 quants of karakuri-ai/karakuri-lm-70b-v0.1, calibrated with the default dataset built into exllamav2.
The quants are compatible with exllamav2 version 0.0.11 and later. tabbyAPI is recommended for loading the model.
The measurement file is stored in the main branch, and each quant is stored in its own branch.
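Because each quant lives in its own branch, a single quant can be fetched by passing the branch name as the revision. The following is a minimal sketch using `snapshot_download` from the `huggingface_hub` package. It assumes that branch names match the quant labels in the table (for example `4.0bpw-h6`), which this README does not state explicitly; the helper names and the local directory are hypothetical.

```python
REPO_ID = "MatrixC7/karakuri-lm-70b-v0.1-exl2"


def build_download_args(branch: str, local_dir: str) -> dict:
    """Build keyword arguments for huggingface_hub.snapshot_download.

    `branch` is assumed to equal a quant label such as "4.0bpw-h6".
    """
    return {"repo_id": REPO_ID, "revision": branch, "local_dir": local_dir}


def download_quant(branch: str, local_dir: str) -> str:
    """Download one quant branch and return the local path."""
    # Imported here so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**build_download_args(branch, local_dir))


if __name__ == "__main__":
    # Hypothetical target directory; adjust to your setup.
    path = download_quant("4.0bpw-h6", "./karakuri-lm-70b-v0.1-exl2-4.0bpw")
    print(path)
```

The returned directory can then be used as the model directory in whichever loader you use.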
The table below lists the calibration perplexity and the wikitext-103-v1 test perplexity for all provided quants.
| Quant | Calibration Perplexity | wikitext-103-v1 Test Perplexity |
|---|---|---|
| 2.4bpw-h6 | 8.4726 | 7.1337 |
| 2.65bpw-h6 | 8.0901 | 6.5724 |
| 3.0bpw-h6 | 7.9927 | 6.4607 |
| 4.0bpw-h6 | 7.6440 | 5.8014 |
| 4.65bpw-h6 | 7.5872 | 5.7112 |
| 5.0bpw-h6 | 7.5745 | |
| 6.0bpw-h6 | 7.5616 | |
| 8.0bpw-h8 | 7.5604 | |
*: The first 2.65bpw-h6 quantized version has been deprecated due to its unexpectedly high wikitext-103-v1 test perplexity.
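To make the calibration perplexity figures easier to compare, the sketch below expresses each quant's value as a percentage increase over the 8.0bpw-h8 quant. The numbers are copied from the table; choosing 8.0bpw-h8 as the baseline and the function name `relative_increase` are assumptions made for illustration only.

```python
# Calibration perplexity values copied from the table in this README.
CALIBRATION_PPL = {
    "2.4bpw-h6": 8.4726,
    "2.65bpw-h6": 8.0901,
    "3.0bpw-h6": 7.9927,
    "4.0bpw-h6": 7.6440,
    "4.65bpw-h6": 7.5872,
    "5.0bpw-h6": 7.5745,
    "6.0bpw-h6": 7.5616,
    "8.0bpw-h8": 7.5604,
}


def relative_increase(ppl: dict, baseline: str = "8.0bpw-h8") -> dict:
    """Return each quant's perplexity increase over the baseline, in percent."""
    base = ppl[baseline]
    return {quant: 100.0 * (value - base) / base for quant, value in ppl.items()}


if __name__ == "__main__":
    for quant, pct in relative_increase(CALIBRATION_PPL).items():
        print(f"{quant:>11}: +{pct:.2f}%")
```

Calibration perplexity is measured on the calibration data itself, so these percentages are only a rough guide to quality loss on other text.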
Model tree for MatrixC7/karakuri-lm-70b-v0.1-exl2: base model meta-llama/Llama-2-70b-hf