GGUF importance matrix (imatrix) quants for https://huggingface.co/jondurbin/bagel-dpo-34b-v0.5
The importance matrix was computed from about 100K tokens (200 batches of 512 tokens, 102,400 in total) of wiki.train.raw.
The imatrix is applied to the K-quants as well (those below Q6_K).
Generated with llama.cpp commit f87f7b89

Layers	Context	Template
60	200000	[INST] <<SYS>> {instructions} <</SYS>> {prompt} [/INST] {response}
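
A minimal sketch of filling the template above in Python. The template string is copied from the table; the function name `build_prompt` and its argument names are hypothetical, chosen only for illustration. The exact whitespace the model expects (for example, newlines around the `<<SYS>>` block) is an assumption here, since the table shows the template on a single line.

```python
# Template copied verbatim from the table above.
TEMPLATE = "[INST] <<SYS>> {instructions} <</SYS>> {prompt} [/INST] {response}"


def build_prompt(instructions: str, prompt: str, response: str = "") -> str:
    """Fill the chat template (hypothetical helper, not part of any library).

    Leave `response` empty when the model should generate the reply;
    the trailing space left by the empty placeholder is then stripped.
    """
    text = TEMPLATE.format(
        instructions=instructions, prompt=prompt, response=response
    )
    return text if response else text.rstrip()


if __name__ == "__main__":
    print(build_prompt("You are a helpful assistant.", "What is GGUF?"))
```

The resulting string can be passed as the prompt to whichever GGUF runtime is in use.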

Model size: 34.4B params
Architecture: llama
Quants: 2-bit, 4-bit, 5-bit, 6-bit, 8-bit
