---
pipeline_tag: text-generation
tags:
- qwen
- qwen-2
- quantized
- 2-bit
- 3-bit
- 4-bit
- 5-bit
- 6-bit
- 8-bit
- 16-bit
- GGUF
inference: false
model_creator: MaziyarPanahi
model_name: calme-2.2-qwen2-72b-GGUF
quantized_by: MaziyarPanahi
license: other
license_name: tongyi-qianwen
license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
---


# MaziyarPanahi/calme-2.2-qwen2-72b-GGUF

The GGUF quantized models in this repository are based on the [MaziyarPanahi/calme-2.2-qwen2-72b](https://huggingface.co/MaziyarPanahi/calme-2.2-qwen2-72b) model.

## How to download
To download only the quants you need, rather than cloning the entire repository, pass an `--include` pattern:

```sh
huggingface-cli download MaziyarPanahi/calme-2.2-qwen2-72b-GGUF --local-dir . --include '*Q2_K*gguf'
```
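
The same selective download can be sketched in Python with `huggingface_hub`. This is a minimal sketch, not part of the original card: the helper names `quant_pattern` and `download_quant` are hypothetical, and it assumes `huggingface_hub` is installed and that the quant name appears in the file name, as the `--include` pattern above implies.

```python
def quant_pattern(quant: str) -> str:
    """Build the glob that selects one quantization level, e.g. '*Q2_K*gguf'."""
    return f"*{quant}*gguf"


def download_quant(quant: str, local_dir: str = ".") -> str:
    """Download only the GGUF files matching one quantization level.

    Hypothetical helper: returns the local folder the files were written to.
    """
    # Imported lazily so quant_pattern() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="MaziyarPanahi/calme-2.2-qwen2-72b-GGUF",
        allow_patterns=[quant_pattern(quant)],  # same glob as --include above
        local_dir=local_dir,
    )
```

Calling `download_quant("Q2_K")` should fetch the same files as the CLI command above, assuming network access to the Hub.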

## Load GGUF models


```sh
./llama.cpp/main -m model_name.Q2_K.gguf -p "<|im_start|>user\nJust say 1, 2, 3 hi and NOTHING else\n<|im_end|>\n<|im_start|>assistant\n" -n 1024
```


## Original README

---

# MaziyarPanahi/calme-2.2-qwen2-72b

This is a fine-tuned version of the `Qwen/Qwen2-72B-Instruct` model. It aims to improve the base model across all benchmarks.

# ⚡ Quantized GGUF

All GGUF models are available here: [MaziyarPanahi/calme-2.2-qwen2-72b-GGUF](https://huggingface.co/MaziyarPanahi/calme-2.2-qwen2-72b-GGUF)

# 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)


|    Tasks     |Version|Filter|n-shot|Metric|Value |   |Stderr|
|--------------|------:|------|-----:|------|-----:|---|-----:|
|truthfulqa_mc2|      2|none  |     0|acc   |0.6761|±  |0.0148|

|  Tasks   |Version|Filter|n-shot|Metric|Value |   |Stderr|
|----------|------:|------|-----:|------|-----:|---|-----:|
|winogrande|      1|none  |     5|acc   |0.8248|±  |0.0107|

|    Tasks    |Version|Filter|n-shot| Metric |Value |   |Stderr|
|-------------|------:|------|-----:|--------|-----:|---|-----:|
|arc_challenge|      1|none  |    25|acc     |0.6852|±  |0.0136|
|             |       |none  |    25|acc_norm|0.7184|±  |0.0131|

|Tasks|Version|     Filter     |n-shot|  Metric   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|-----:|---|-----:|
|gsm8k|      3|strict-match    |     5|exact_match|0.8582|±  |0.0096|
|     |       |flexible-extract|     5|exact_match|0.8893|±  |0.0086|

# Prompt Template

This model uses the `ChatML` prompt template:

```
<|im_start|>system
{System}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}
```
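
The template above can be filled in programmatically. The following is a minimal sketch, not the author's code: `build_chatml_prompt` is a hypothetical helper, and it assumes the layout shown above, with `<|im_end|>` on its own line and the assistant turn left open for the model to complete.

```python
from typing import Optional


def build_chatml_prompt(user: str, system: Optional[str] = None) -> str:
    """Format one user turn (and an optional system turn) as a ChatML prompt."""
    parts = []
    if system is not None:
        parts.append(f"<|im_start|>system\n{system}\n<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user}\n<|im_end|>\n")
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_chatml_prompt("Who are you?", system="You are a helpful assistant."))
```

When using `transformers`, the tokenizer's built-in chat template is usually the safer choice, since it reflects the exact format the tokenizer was shipped with; a hand-built string like this is mainly useful for tools that take a raw prompt.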

# How to use


```python
# Option 1: use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="MaziyarPanahi/calme-2.2-qwen2-72b")
# Print the result so the reply is visible when run as a script
print(pipe(messages))


# Option 2: load the tokenizer and model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/calme-2.2-qwen2-72b")
model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-2.2-qwen2-72b")
```