likejazz committed on
Commit 500c4e8 · verified · 1 Parent(s): 5bd61bf

Create README.md

Files changed (1):
  1. README.md +294 -0

README.md ADDED

---
language:
- en
- ko
license: cc-by-nc-4.0
tags:
- dnotitia
- nlp
- llm
- slm
- conversation
- chat
base_model:
- meta-llama/Meta-Llama-3.1-8B
library_name: transformers
pipeline_tag: text-generation
---

# DNA 1.0 8B Instruct
<br>
<p align="center">
<img src="assets/dna-logo.png" width="400" style="margin: 40px auto;">
</p>
<br>

## Introduction

We introduce **DNA 1.0 8B Instruct**, a state-of-the-art (**SOTA**) bilingual language model optimized for both Korean and English, developed and released by **Dnotitia Inc.** This model is based on the Llama architecture and has been enhanced through several advanced training techniques to excel at language understanding and generation tasks.

The DNA 1.0 8B Instruct model has undergone a sophisticated development process:

- **Model Merging via SLERP:** Merged with Llama 3.1 8B Instruct using spherical linear interpolation to enhance performance (see the sketch after this list).
- **Knowledge Distillation (KD):** Used Llama 3.1 405B as the teacher model to improve knowledge representation.
- **Continual Pre-Training (CPT):** Trained on a high-quality Korean dataset to boost language capabilities.
- **Supervised Fine-Tuning (SFT):** Aligned with human preferences through fine-tuning on curated data.
- **Direct Preference Optimization (DPO):** Enhanced instruction-following abilities for better user interaction.

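The exact merge recipe (interpolation factor and per-tensor handling) is not specified here, but the following minimal sketch illustrates how SLERP blends two checkpoints' weight tensors; the `torch` usage and the factor `t = 0.5` are assumptions for illustration only.

```python
# Illustrative SLERP merge of two same-shaped weight tensors (not the official recipe).
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two flattened weight tensors."""
    a, b = v0.flatten().float(), v1.flatten().float()
    a_unit = a / (a.norm() + eps)
    b_unit = b / (b.norm() + eps)
    # Angle between the two weight vectors.
    omega = torch.arccos(torch.clamp(torch.dot(a_unit, b_unit), -1.0, 1.0))
    if omega < 1e-4:
        # Nearly colinear vectors: fall back to plain linear interpolation.
        merged = (1.0 - t) * a + t * b
    else:
        so = torch.sin(omega)
        merged = (torch.sin((1.0 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b
    return merged.reshape(v0.shape).to(v0.dtype)

# Hypothetical usage: merge two state dicts tensor by tensor with factor t = 0.5.
# merged_sd = {name: slerp(0.5, sd_a[name], sd_b[name]) for name in sd_a}
```
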
The model supports long-context processing of up to **131,072 tokens (128K)**, enabling it to handle extensive conversational histories and long documents effectively.

Our documentation consists of the following sections:

- [Evaluation](#evaluation): Experimental results of DNA 1.0 8B Instruct.
- [Quickstart](#quickstart): A basic guide to downloading DNA 1.0 8B Instruct weights, including quantized `GGUF` versions, and running the model with Transformers.
- [Run Locally](#run-locally): Guides to running DNA 1.0 8B Instruct locally with the `llama.cpp` and `Ollama` frameworks.

The model can also be deployed with serving frameworks such as `vLLM` and `SGLang`.

<br>

## News

- **2024.12.10**: Released the DNA 1.0 8B Instruct model. Try the DNA-powered Mnemos Assistant! 👉 [Beta Open](https://request-demo.dnotitia.ai/)
- **2024.12.15**: Released GGUF quantized versions of the DNA 1.0 8B Instruct model.

<br>

## Evaluation

We evaluated DNA 1.0 8B Instruct against other prominent language models of similar sizes across various benchmarks, including Korean-specific tasks and general language understanding metrics.

<br>

<table>
<tr>
<th>Language</th>
<th>Benchmark</th>
<th>dnotitia<br>DNA 1.0<br>8B Instruct</th>
<th>EXAONE 3.5<br>7.8B</th>
<th>Qwen 2.5<br>7B</th>
<th>Llama 3.1<br>8B</th>
<th>Mistral<br>7B</th>
</tr>
<tr>
<td rowspan="5">Korean</td>
<td>KMMLU</td>
<td align="center"><strong>53.26</strong></td>
<td align="center">45.30</td>
<td align="center">45.66</td>
<td align="center">41.66</td>
<td align="center">31.45</td>
</tr>
<tr>
<td>KMMLU-Hard</td>
<td align="center"><strong>29.46</strong></td>
<td align="center">23.17</td>
<td align="center">24.78</td>
<td align="center">20.49</td>
<td align="center">17.86</td>
</tr>
<tr>
<td>KoBEST</td>
<td align="center"><strong>83.40</strong></td>
<td align="center">79.05</td>
<td align="center">78.51</td>
<td align="center">67.56</td>
<td align="center">63.77</td>
</tr>
<tr>
<td>Belebele</td>
<td align="center"><strong>57.99</strong></td>
<td align="center">40.97</td>
<td align="center">54.85</td>
<td align="center">54.70</td>
<td align="center">40.31</td>
</tr>
<tr>
<td>CSAT QA</td>
<td align="center">43.32</td>
<td align="center">40.11</td>
<td align="center"><strong>45.45</strong></td>
<td align="center">36.90</td>
<td align="center">27.27</td>
</tr>
<tr>
<td rowspan="3">English</td>
<td>MMLU</td>
<td align="center">66.64</td>
<td align="center">65.27</td>
<td align="center"><strong>74.26</strong></td>
<td align="center">68.26</td>
<td align="center">62.04</td>
</tr>
<tr>
<td>MMLU Pro</td>
<td align="center"><strong>43.05</strong></td>
<td align="center">40.73</td>
<td align="center">42.50</td>
<td align="center">40.92</td>
<td align="center">33.49</td>
</tr>
<tr>
<td>GSM8K</td>
<td align="center"><strong>80.52</strong></td>
<td align="center">65.96</td>
<td align="center">75.74</td>
<td align="center">75.82</td>
<td align="center">49.66</td>
</tr>
</table>

- The **highest scores** are in **bold**.

<br>

**Evaluation Protocol**

For easy reproduction of our evaluation results, we list the evaluation tools and settings used below:

| Benchmark  | Evaluation Setting | Metric                      | Evaluation Tool   |
|------------|--------------------|-----------------------------|-------------------|
| KMMLU      | 5-shot             | `macro_avg` / `exact_match` | `lm-eval-harness` |
| KMMLU-Hard | 5-shot             | `macro_avg` / `exact_match` | `lm-eval-harness` |
| KoBEST     | 5-shot             | `macro_avg` / `f1`          | `lm-eval-harness` |
| Belebele   | 0-shot             | `accuracy`                  | `lm-eval-harness` |
| CSAT QA    | 0-shot             | `accuracy_normalized`       | `lm-eval-harness` |
| MMLU       | 5-shot             | `macro_avg` / `accuracy`    | `lm-eval-harness` |
| MMLU Pro   | 5-shot             | `macro_avg` / `exact_match` | `lm-eval-harness` |
| GSM8K      | 5-shot             | `accuracy` / `exact_match`  | `lm-eval-harness` |

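As a reproduction aid, the sketch below calls the `lm-eval-harness` Python entry point on a subset of the 5-shot benchmarks; the task names, harness version, and repository ID are assumptions and may need adjusting (Belebele and CSAT QA are 0-shot per the table).

```python
# Reproduction sketch with lm-evaluation-harness (pip install lm-eval).
# Task names and the model repository ID are assumptions; adjust the few-shot
# count per task to match the table above.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=dnotitia/Llama-DNA-1.0-8B-Instruct,dtype=bfloat16",
    tasks=["kmmlu", "kobest", "mmlu", "gsm8k"],  # 5-shot benchmarks from the table
    num_fewshot=5,
    batch_size=8,
)
print(results["results"])
```
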
<br>

## Quickstart

We offer weights in `F32` and `F16` formats, as well as quantized weights in `Q8_0`, `Q6_K`, `Q5_K`, `Q4_K`, `Q3_K`, and `Q2_K` formats.

You can download the GGUF weights as follows:

```bash
# Install huggingface_hub if not already installed
pip install huggingface_hub

# Download the GGUF weights
huggingface-cli download dnotitia/Llama-DNA-1.0-8B-Instruct-GGUF \
  --include "DNA-1.0-8B-Instruct-Q8_0.gguf" \
  --local-dir .
```
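
To run the model with the `transformers` library instead of GGUF, a minimal generation sketch is shown below. The repository ID and generation settings are assumptions based on this card; adjust them to the actual Hub repository.

```python
# Minimal transformers generation sketch (assumed repository ID and settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dnotitia/Llama-DNA-1.0-8B-Instruct"  # assumption: adjust to the actual repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant, Dnotitia DNA."},
    {"role": "user", "content": "Briefly introduce yourself in Korean."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# repetition_penalty=1.0 follows the recommendation in the Run Locally section below.
output = model.generate(input_ids, max_new_tokens=256, repetition_penalty=1.0)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```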

<br>

## Run Locally

For end users, we introduce two ways to run the DNA 1.0 8B Instruct model locally.

> **Note**
>
> We recommend using a repetition penalty not exceeding 1.0 for better generation quality.

### llama.cpp

You can run the DNA 1.0 8B Instruct model with `llama.cpp` as follows:

1. Install `llama.cpp`. Please refer to the [llama.cpp repository](https://github.com/ggerganov/llama.cpp) for more details.

2. Download the DNA 1.0 8B Instruct model in GGUF format.

```bash
huggingface-cli download dnotitia/Llama-DNA-1.0-8B-Instruct-GGUF \
  --include "DNA-1.0-8B-Instruct-BF16*.gguf" \
  --local-dir .
```

3. Run the model with `llama.cpp` in conversational mode.

```bash
llama-cli -cnv -m ./DNA-1.0-8B-Instruct-BF16.gguf \
  -p "You are a helpful assistant, Dnotitia DNA."
```

### Ollama

The DNA 1.0 8B Instruct model is compatible with Ollama. You can use it as follows:

1. Install Ollama. Please refer to the [Ollama repository](https://github.com/ollama/ollama) for more details.

2. Create a `Modelfile` for DNA 1.0 8B Instruct. (A sketch of the prompt layout this template produces follows the steps below.)

```text
# Model path (choose appropriate GGUF weights)
FROM ./DNA-1.0-8B-Instruct-BF16.gguf

# Parameter values
PARAMETER stop "<|endoftext|>"
PARAMETER repeat_penalty 1.0
# PARAMETER num_ctx 131072 # if you need a long context

# Chat template
TEMPLATE """{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{ if eq .Role "system" }}[|system|]{{ .Content }}[|endoftext|]
{{ continue }}
{{ else if eq .Role "user" }}[|user|]{{ .Content }}
{{ else if eq .Role "assistant" }}[|assistant|]{{ .Content }}[|endoftext|]
{{ end }}
{{- if and (ne .Role "assistant") $last }}[|assistant|]{{ end }}
{{- end -}}"""

# System prompt
SYSTEM """You are a helpful assistant, Dnotitia DNA."""

# License
LICENSE """CC BY-NC 4.0"""
```

3. Create the Ollama model from the `Modelfile`.

```bash
ollama create dna -f Modelfile
```

4. Run the model with Ollama.

```bash
ollama run dna
```

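For reference outside Ollama, the chat template in the `Modelfile` above corresponds approximately to the following single-turn prompt layout. This is a sketch only; `build_prompt` is a hypothetical helper, and the exact whitespace handling may differ from what Ollama renders.

```python
# Approximate prompt layout implied by the Modelfile chat template above.
# The special tokens ([|system|], [|user|], [|assistant|], [|endoftext|]) come from
# that template; verify them against the model's tokenizer configuration.
def build_prompt(system: str, user: str) -> str:
    return (
        f"[|system|]{system}[|endoftext|]\n"
        f"[|user|]{user}\n"
        f"[|assistant|]"
    )

print(build_prompt("You are a helpful assistant, Dnotitia DNA.", "안녕하세요?"))
```
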
<br>

## Limitations

While DNA 1.0 8B Instruct demonstrates strong performance, users should be aware of the following limitations:

- The model may occasionally generate biased or inappropriate content.
- Responses are based on training data and may not reflect current information.
- The model may sometimes produce factually incorrect or inconsistent answers.
- Performance may vary depending on the complexity and domain of the task.
- Generated content should be reviewed for accuracy and appropriateness.

<br>

## License

The model is released under the [CC BY-NC 4.0 license](./LICENSE). For commercial usage inquiries, please [Contact us](https://www.dnotitia.com/contact/post-form).

<br>

## Citation

If you use or discuss this model in your academic research, please cite the project:

```
@misc{dnotitiadna2024,
  title   = {Dnotitia DNA 1.0 8B Instruct},
  author  = {Jungyup Lee and Jemin Kim and Sang Park and Seungjae Lee},
  year    = {2024},
  url     = {https://huggingface.co/dnotitia/DNA-1.0-8B-Instruct},
  version = {1.0},
}
```

<br>

## Contact

For technical support and inquiries: [Contact us](https://www.dnotitia.com/contact/post-form)