## Model Details

This model is an int4 model with group_size 64 and symmetric quantization of [deepseek-ai/DeepSeek-V2-Lite](https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite), generated by [intel/auto-round](https://github.com/intel/auto-round). Please follow the license of the original model.
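To make the scheme concrete, here is a minimal sketch of what symmetric, group-wise int4 quantization computes: every 64 consecutive weights share one scale, values are rounded to integers in [-8, 7], and dequantization is a single multiply. This is an illustrative re-implementation only; auto-round's actual algorithm additionally tunes the rounding via signed gradient descent (see the citation below), and `quantize_sym_int4` is a hypothetical helper, not part of any library.

```python
import torch

def quantize_sym_int4(w: torch.Tensor, group_size: int = 64):
    """Illustrative symmetric int4 group-wise quantization (not auto-round's kernel)."""
    out_features, in_features = w.shape
    groups = w.reshape(out_features, in_features // group_size, group_size)
    # Symmetric quantization: one scale per group, zero-point fixed at 0.
    scale = (groups.abs().amax(dim=-1, keepdim=True) / 7.0).clamp_min(1e-12)
    q = torch.clamp(torch.round(groups / scale), -8, 7)      # int4 range is [-8, 7]
    w_hat = (q * scale).reshape(out_features, in_features)   # dequantized weights
    return q.to(torch.int8), scale, w_hat

w = torch.randn(8, 128)
q, scale, w_hat = quantize_sym_int4(w)
print((w - w_hat).abs().max())  # per-group quantization error stays small
```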
## INT4 Inference (CPU/HPU/CUDA)
```python
from auto_round import AutoRoundConfig  # must import to register the auto-round format
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

quantized_model_dir = "OPEA/DeepSeek-V2-Lite-int4-sym-inc"
tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    quantized_model_dir,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
model.generation_config = GenerationConfig.from_pretrained(quantized_model_dir)
model.generation_config.pad_token_id = model.generation_config.eos_token_id

prompt = "There is a girl who likes adventure,"
messages = [
    {"role": "user", "content": prompt}
]
input_tensor = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_tensor.to(model.device), max_new_tokens=200, do_sample=False)
result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=True)
print(result)
prompt = "9.11和9.8哪个数字大"
## INT4
""" 9.11 和 9.8 都是数字,我们可以直接比较它们的大小。在数字序列中,9.11 大于 9.8,因为 9.11 的数值更大。所以,9.11 比 9.8 大。"""
## BF16
""" 9.11和9.8都是小数,比较它们的大小,我们可以直接比较整数部分和小数部分。
整数部分:9 和 9
小数部分:11 和 8
由于整数部分相同,我们只需要比较小数部分。11 大于 8,所以 9.11 大于 9.8。
因此,9.11 比 9.8 大。"""
prompt = "strawberry单词中有几个字母r"
##INT4
"""
单词 "strawberry" 中有4个字母 "r"。
"""
##BF16
"""
单词 "strawberry" 中有四个字母 "r"。
"""
prompt = "There is a girl who likes adventure,"
## INT4
"""
and she goes on many adventures.
She climbs mountains,
jumps off cliffs,
and swims with sharks.
She is brave and strong,
and she never gives up.
She makes new friends on her adventures,
and learns new things about the world.
She is always ready for her next adventure,
and she can't wait to see what the future holds."""
## BF16
""" and she goes on many adventures.
She climbs mountains,
jumps over rivers,
and explores the deep, dark woods.
She makes friends with animals,
and learns about nature's secrets.
She is brave and strong,
and never gives up.
The girl with the adventurous spirit
lives life to the fullest,
and always has a story to tell.
"""
prompt = "Once upon a time,
## INT4
""" Once upon a time, in a land far, far away, there was a kingdom filled with magic and wonder. The kingdom was ruled by a wise and kind-hearted king and queen who ruled with love and fairness. They had three children, a prince, a princess, and a mischievous little fairy named Tilly.
The kingdom was a place of beauty and enchantment, with lush forests, sparkling rivers, and towering mountains. The people of the kingdom were happy and prosperous, living in harmony with nature and each other.
One day, a terrible storm swept through the kingdom, bringing with it a dark and mysterious force. The storm was so powerful that it swept away the king and queen, leaving the three children orphaned and alone.
The children were devastated by the loss of their parents, but they knew they had to find a way to save their kingdom from the dark force that had taken control. They set out on a journey to find the ancient and powerful Crystal of"""
## BF16
""" Once upon a time, in a land far, far away, there was a kingdom filled with magic and wonder. The kingdom was ruled by a wise and kind king and queen who were loved and respected by all their subjects.
In this kingdom, there lived a brave and adventurous prince named Leo. He was known for his kindness, intelligence, and a heart full of courage. Leo had a loyal companion, a wise old owl named Athena, who had been his mentor since he was a little boy.
One day, the kingdom was struck by a terrible curse. A dark sorcerer, who was envious of the king's power and the people's happiness, cast a spell that turned the kingdom into a land of eternal night. The once-beautiful kingdom was now shrouded in darkness, and the people were filled with despair.
Determined to save his people and restore the kingdom to its former glory, Prince Leo embarked on a quest to find the sorcerer and"""
```
## Evaluate the model

```bash
pip3 install lm-eval==0.4.5
auto-round --model "OPEA/DeepSeek-V2-Lite-int4-sym-inc" --eval --eval_bs 16 --tasks leaderboard_ifeval,leaderboard_mmlu_pro,gsm8k,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,cmmlu,ceval-valid
```
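If you prefer to drive the evaluation from Python rather than the auto-round CLI, a sketch using lm-eval's `simple_evaluate` API is below (assuming lm-eval==0.4.5 as installed above; a subset of tasks is shown, and the full task list from the command above can be substituted):

```python
import lm_eval

# Sketch: evaluate the quantized model with lm-eval's Python API.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=OPEA/DeepSeek-V2-Lite-int4-sym-inc,trust_remote_code=True,dtype=float16",
    tasks=["piqa", "hellaswag", "arc_easy", "arc_challenge"],
    batch_size=16,
)
print(results["results"])
```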
| Metric | BF16 | INT4 |
| --- | --- | --- |
| Avg | 0.5874 | 0.5727 |
| leaderboard_mmlu_pro (5 shots) | 0.2847 | 0.2788 |
| leaderboard_ifeval (average of two sub-scores) | 0.3605 = (0.4233 + 0.2976) / 2 | 0.3327 = (0.4029 + 0.2625) / 2 |
| cmmlu | 0.6206 | 0.6099 |
| ceval-valid | 0.5944 | 0.5795 |
| gsm8k (5 shots) | 0.6581 | 0.6126 |
| lambada_openai | 0.7050 | 0.6996 |
| hellaswag | 0.6260 | 0.6125 |
| winogrande | 0.7174 | 0.7214 |
| piqa | 0.7965 | 0.7943 |
| truthfulqa_mc1 | 0.3513 | 0.3378 |
| openbookqa | 0.3480 | 0.3280 |
| boolq | 0.8306 | 0.8287 |
| arc_easy | 0.8026 | 0.7942 |
| arc_challenge | 0.5282 | 0.4881 |
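For reference, the Avg row appears to be the unweighted mean of the 14 per-task scores; a quick check reproduces it:

```python
# Reproduce the Avg row as the unweighted mean of the 14 task scores.
bf16 = [0.2847, 0.3605, 0.6206, 0.5944, 0.6581, 0.7050, 0.6260,
        0.7174, 0.7965, 0.3513, 0.3480, 0.8306, 0.8026, 0.5282]
int4 = [0.2788, 0.3327, 0.6099, 0.5795, 0.6126, 0.6996, 0.6125,
        0.7214, 0.7943, 0.3378, 0.3280, 0.8287, 0.7942, 0.4881]
print(round(sum(bf16) / 14, 4), round(sum(int4) / 14, 4))  # 0.5874 0.5727
```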
## Generate the model

Here is a sample command to generate the model. As shown above, we observed a slightly larger accuracy drop on some tasks; you may consider experimenting with alternative algorithms or setting the group size to 32.
```bash
auto-round \
  --model deepseek-ai/DeepSeek-V2-Lite \
  --device 0 \
  --group_size 64 \
  --nsamples 512 \
  --bits 4 \
  --iter 1000 \
  --disable_eval \
  --format 'auto_gptq,auto_round' \
  --output_dir "./tmp_autoround"
```
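Equivalently, quantization can be driven from Python. Below is a minimal sketch using auto-round's `AutoRound` API with parameters mirroring the CLI flags above; treat it as an outline under those assumptions and check the signature of your installed auto-round version:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "deepseek-ai/DeepSeek-V2-Lite"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Mirrors the CLI above: int4, symmetric, group_size 64 (try 32 if the accuracy drop matters).
autoround = AutoRound(model, tokenizer, bits=4, group_size=64, sym=True,
                      nsamples=512, iters=1000)
autoround.quantize()
autoround.save_quantized("./tmp_autoround", format="auto_round", inplace=True)
```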
## Ethical Considerations and Limitations

The model can produce factually incorrect output and should not be relied on to produce factually accurate information. Because of the limitations of the pretrained model and the finetuning datasets, it is possible that this model could generate lewd, biased, or otherwise offensive outputs.

Therefore, before deploying any applications of the model, developers should perform safety testing.
## Caveats and Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.

Here is a useful link to learn more about Intel's AI software:

- [Intel Neural Compressor](https://github.com/intel/neural-compressor)
## Disclaimer

The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.
## Cite

```bibtex
@article{cheng2023optimize,
  title={Optimize weight rounding via signed gradient descent for the quantization of LLMs},
  author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
  journal={arXiv preprint arXiv:2309.05516},
  year={2023}
}
```