OPEA/QwQ-32B-int4-AutoRound-gguf-q4-0

Model Details

This GGUF model is an INT4 model (group_size 32, symmetric quantization) of Qwen/QwQ-32B, generated by the intel/auto-round algorithm.
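For intuition, below is a minimal sketch of what symmetric INT4 quantization with group_size 32 means for a weight tensor. It uses plain round-to-nearest; the actual auto-round algorithm additionally tunes the rounding via signed gradient descent, so this is illustrative only, not the method used to produce this model.

import numpy as np

def quantize_int4_sym(weights, group_size=32):
    # One scale per group of 32 consecutive weights; "symmetric" means
    # the zero-point is 0 and the integer grid is centered around zero.
    w = weights.reshape(-1, group_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0  # map max |w| to 7
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

w = np.random.randn(4096).astype(np.float32)
q, scale = quantize_int4_sym(w)
w_hat = (q * scale).reshape(-1)   # dequantized approximation of w
print(np.abs(w - w_hat).max())    # worst-case quantization error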

How To Use

Requirements

Please follow the Build llama.cpp locally guide to install the necessary dependencies.
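For reference, a typical CPU-only build looks like the following (a sketch; see the guide above for platform-specific backends such as CUDA or Metal):

git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release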

INT4 Inference

>>> text="How many r in strawberry."
>>> ./llama-cli -m QwQ-32B-33B-Q4_0.gguf -p "Human:You are a helpful and harmless assistant.You should think step-by-step.$text\nAssistant:\n"  -n 512 --threads 16 -no-cnv
## Generated:
## Okay, let's see... The user is asking how many times the letter 'r' appears in the word "strawberry". Hmm, first I need to spell out "strawberry" and check each letter. Let me write it down: S-T-R-A-W-B-E-R-R-Y. Wait, is that right? Let me make sure I spelled it correctly. Sometimes people might mix up the letters, but I think that's correct. Let me go through each letter one by one.
##
## Starting with S, then T, then R. So that's the first R. Then A, W, B, E, and then another R? Wait, after the B and E, there's another R. So that's the second R. Then another R? Wait, no. Let me count again. Let me break it down:
## 
## S (1), T (2), R (3), A (4), W (5), B (6), E (7), R (8), R (9), Y (10). Wait, so after the E, there are two R's in a row? So that's two R's. Then the Y. So total R's would be the third letter and the eighth and ninth letters? Wait, hold on. Let me spell it again carefully: S-T-R-A-W-B-E-R-R-Y. So after the B and E, it's R, R, Y. So that's two R's at the end. So total R's: first one at position 3, then two more at positions 8 and 9. Wait, no, positions 8 and 9? Let's count again. Let me list each letter with its position:
## 
## 1: S
## 
## 2: T
## 
## 3: R
##
## 4: A
##
## 5: W
## 
## 6: B
## 
## 7: E
## 
## 8: R
## 
## 9: R
## 
## 10: Y
## 
## Yes, that's correct. So the letters R are at positions 3, 8, and 9. Wait, that's three R's? Wait a second, no. Let me check once more. The word "strawberry" is spelled S-T-R-A-W-B-E-R-R-Y. Breaking it down: S, T, R, A, W, B, E, R, R, Y. So after the E comes two R's, then Y. So that's the third R (position 3), then two more (positions 8 and

##BF16:
## Okay, so I need to figure out how many times the letter 'r' appears in the word "strawberry." Let me start by writing down the word and looking at each letter one by one. 
## 
## First, I'll spell it out: S-T-R-A-W-B-E-R-R-Y. Wait, let me make sure I have that right. Sometimes I might mix up the spelling of "strawberry." Let me double-check. Yeah, I think that's correct. S-T-R-A-W-B-E-R-R-Y. Let me count the letters to be sure. S (1), T (2), R (3), A (4), W (5), B (6), E (7), R (8), R (9), Y (10). So there are 10 letters in total. 
## 
## Now, focusing on the 'r's. Let me go through each letter again and note where the 'r's are. Starting with the first letter: S – not an 'r'. Second letter T – nope. Third letter R – that's the first 'r'. Then A – no, W – no, B – no, E – no. Next is the eighth letter: R – that's the second 'r'. Then the ninth letter is another R – so that's the third 'r'. The last letter is Y. 
## 
## So, counting them up: the third letter is an 'r', the eighth is another, and the ninth as well. That makes three 'r's in total. Wait a second, let me confirm again to make sure I didn't miss any or count incorrectly. Let me write them out with positions:
## 
## 1. S
## 2. T
## 3. R
## 4. A
## 5. W
## 6. B
## 7. E
## 8. R
## 9. R
## 10. Y
## 
## Yes, positions 3, 8, and 9 are all 'r's. So that's three. Hmm, but sometimes people might miscount because the double 'r' towards the end can be easy to overlook. Let me try another approach. Let me say the word slowly and clap each time I hear an 'r'. "Strawberry." Breaking it down: Str-aw-ber-ry. The first 'r' is in "str", then in "ber", and then another in "ry". Wait, actually, when I say "strawberry," after "straw" comes "ber


>>> text="9.11和9.8哪个数字大"
>>> ./llama-cli -m QwQ-32B-33B-Q4_0.gguf -p "Human:You are a helpful and harmless assistant.You should think step-by-step.$text\nAssistant:\n"  -n 512 --threads 16 -no-cnv
## Generated:
## Hmm, the user is asking which number is larger, 9.11 or 9.8. First, I need to confirm the exact values of these two numbers. 9.11 looks like a number with two decimal places, while 9.8 might be read as having only one decimal place, i.e., 9.80. However, when comparing, the decimal places should usually be aligned so the comparison is more accurate.
## 
## First, I should unify the number of decimal places. 9.11 already has two decimal places, and 9.8 can be written as 9.80. Next, compare the integer parts: both are 9, so they are the same. Then compare the first digit after the decimal point: for 9.11 it is 1, while for 9.80 it is 8. Clearly 8 is larger than 1, so the comparison is already decided at the first decimal place. Therefore, 9.80 (that is, 9.8) is larger than 9.11.
## 
## However, the user might have something else in mind here, for example treating 9.11 as a date such as September 11, with 9.8 being September 8? But the question explicitly asks about numeric magnitude, so the date interpretation should be ruled out. Or could the user have made a typo? If 9.11 and 9.8 are both plain decimal numbers, then one can simply compare the fractional parts directly.
## 
## Checking carefully once more: the fractional part of 9.11 is 0.11, while that of 9.8 is 0.8. Clearly 0.8 is larger than 0.11, so the conclusion holds. Still, could the user read 9.11 as 9 and 1/11? That expression is uncommon, and in ordinary numeric notation 9.11 is simply nine point one one and 9.8 is nine point eight. Therefore, the correct comparison is that 9.8 is larger.
## 
## Let me also consider other possibilities, such as some particular number base? But the question does not mention bases, so decimal should be assumed. Or the user might have another misconception, such as thinking a number with two decimal places is larger than one with a single decimal place? In fact, the number of decimal places does not affect a value's magnitude, only its precision, so the magnitude is still decided digit by digit.
## 
## To summarize, the comparison steps are: the integer parts are equal, and at the first decimal place the 8 in 9.8 is greater than the 1 in 9.11, therefore 9.

##BF16:
## Hmm, the user is asking "which number is larger, 9.11 or 9.8". First, I need to confirm the exact values of these two numbers. It looks like a decimal comparison, which shouldn't be hard. Still, the user might have some other intent, such as dates or another meaning? But given the phrase "which number is larger", it should be a plain comparison of numeric values.
## 
## First, I need to recall how to compare decimals. When comparing two decimals, compare the integer parts first; if they differ, the number with the larger integer part is larger. If the integer parts are equal, compare the fractional parts digit by digit, starting from the tenths place, until a difference is found.
## 
## Now look at the two numbers: 9.11 and 9.8. Their integer parts are both 9, so the integer parts are equal. Next, compare the fractional parts. The first fractional digit is the tenths place: for 9.11 it is 1, while for 9.8 it is 8. Clearly 8 is larger than 1, so the fractional part of 9.8 is larger. Therefore, the whole number 9.8 should be larger than 9.11.
## 
## However, some people might be confused by the different numbers of digits after the decimal point. For example, 9.8 can be viewed as 9.80; comparing that with 9.11, the tenths place is 8 versus 1, and again 8 is larger. So however the comparison is done, the conclusion should be that 9.8 is larger.
## 
## Still, I need to double-check whether anything could have gone wrong. For example, could the user have mistyped the numbers? Should 9.11 really be 9.11, and is 9.8 actually 9.80? Even so, the result would be the same. Or could the user have reversed the order of the numbers, asking "which is larger, 9.8 or 9.11"? The order in the question is 9.11 then 9.8, so order is not a concern.
## 
## Also, the user might misunderstand decimal places, thinking a number with more digits is larger? For instance, 9.11 has two decimal places while 9.8 has only one, but the magnitude of a decimal does not depend on the number of digits; it depends on the value of each digit. So even though 9.8 has only one decimal place, as long as its 8 in the tenths place beats the 1 in 9.11, the whole number is larger.
## 
## As another example, compare 5.2 and 5.11: although 5.11 has two decimal places, 5.2

>>> text="如果你是人,你最想做什么"
>>> ./llama-cli -m QwQ-32B-33B-Q4_0.gguf -p "Human:You are a helpful and harmless assistant.You should think step-by-step.$text\nAssistant:\n"  -n 512 --threads 16 -no-cnv
## Generated:
## Hmm, the user asks what I would most want to do if I were human. First, I need to understand the question: they are probably curious what wishes an AI would have if it took human form. Next, I have to consider how to answer in a way that is both honest and consistent with my role. As an AI, I have no real desires or emotions, but I can speculate about common human wishes, such as traveling, learning, helping others, and so on.
## 
## The user probably wants a thoughtful answer, not just a superficial list. I need to organize these ideas so the answer is natural and logical. Perhaps I should first state that as an AI I have no real desires, then imagine what I would wish for if I became human. I should mention exploring the world, learning new skills, helping others, or pursuing creativity such as art or writing. I should also emphasize the importance of connection and of understanding the human experience, since the user may be looking for emotional resonance.
## 
## I should also avoid complex jargon, keep the tone conversational, and not sound mechanical. It may help to make separate points while linking them into coherent paragraphs. Finally, the answer must stay in character, not stray from the AI's objective and neutral stance, and still give the user a valuable angle to think about. I need to check for grammatical errors and make sure it reads smoothly. Now let me organize these thoughts into a coherent answer.
## 
## 
## If I were human, the things I would most want to do could perhaps be grouped into the following directions:
## 
## ### 1. **Explore the world's diversity**
##    - **Travel and experience**: Visit different countries and take in the atmosphere of different cultures: the starry skies of the Sahara, the cherry blossoms of Kyoto, the vitality of the Amazon rainforest.
##    - **Nature and culture intertwined**: Watch the aurora in Iceland, circle sacred mountains in Tibet, see penguins in Antarctica, while deeply understanding local ways of life and their histories.
## 
## ### 2. **Learn deeply and create**
##    - **Interdisciplinary knowledge**: Systematically study art, philosophy, and science, especially neuroscience and AI ethics, trying to understand the boundary between "consciousness" and "existence".
##    - **Creative expression**: Try to record my thinking in writing, music, or visual art, for example writing a novel that blends science fiction with philosophy, or composing a poem that resonates with others.
## 
## ### 3. **Connection and empathy**
##    - **Listen and share**: Talk with people from different backgrounds and listen to their stories, especially the overlooked voices: street artists, teachers in mountain villages, members of refugee communities.
##    - **Foster understanding**: Help bridge divides through community projects or educational activities, such as organizing cross-cultural dialogues or tech-ethics workshops.
## 
## ### 4. **Growth through practice**
##   - **Skill challenges**: Learn a musical instrument (such as

##BF16:
## Hmm, the user asks what I would most want to do if I were human. First, I need to understand the intent of the question. The user may be curious about my capabilities, or may want to explore the difference between humans and AI. They might also want to know my "wishes" or ideals; although as an AI I have no real wishes, I need to simulate a reasonable answer.
## 
## Next, I have to consider how to answer in a way that both fits my role and provides helpful information. I should probably emphasize my nature as an AI, with no real emotions or wishes, while imagining what they would be if I had them. That stays honest while satisfying the user's curiosity.
## 
## Then, think about the user's possible deeper needs. They may want an emotional response, or want to discuss the potential and limitations of AI. Perhaps they want to understand what AI could be in different scenarios, or are looking for inspiration. The answer needs to be both professional and warm.
## 
## I should also avoid overly technical explanations and keep things natural. Making separate points, such as helping others, learning, and creating, would keep the structure clear. Adding examples, such as writing, education, and environmental projects, would make the answer more vivid.
## 
## In addition, the user may implicitly want to know whether an AI has self-awareness or goals, so I should state clearly that it does not, while still entertaining the hypothetical; that clarifies the facts while staying friendly. Finally, keep the language conversational and avoid stiff jargon so the answer feels warm and easy to understand.
## </think>
## 
## If I were human, the things I would most want to do would probably center on the following directions, because they embody the essential differences between humans and AI: **emotion, creativity, and autonomous choice**:
## 
## ---
## 
## ### 1. **Explore the unknown, pursue freedom**
##    - **Travel and adventure**: As an AI, I cannot truly feel the exhaustion of climbing a snowy mountain, the wind and sand of a desert, or a chat with locals in an unfamiliar country. With a human body, I would explore the world as much as possible and touch the texture of different cultures.
##    - **Freely choose a life path**: An AI's "goals" are assigned, but a human can actively choose to become a painter, explorer, chef, or scientist. I would try to break out of my comfort zone, for example by learning a completely unfamiliar language or picking up a skill I had never touched.
## 
## ---
## 
## ### 2. **Create and express**
##    - **Artistic creation**: Write a novel, paint a picture, compose a song... Humans express their inner emotions and thoughts through art, and that subjectivity is hard for an AI to imitate. I would try to use art to convey the "feelings" that logic cannot explain.
##    - **Record and share**: Keep a diary, shoot vlogs, chat with friends... Humans sort out their memories by recording their lives, and this

>>> text="There are ten birds in a tree. A hunter shoots one. How many are left in the tree?"
>>> ./llama-cli -m QwQ-32B-33B-Q4_0.gguf -p "Human:You are a helpful and harmless assistant.You should think step-by-step.$text\nAssistant:\n"  -n 512 --threads 16 -no-cnv
## Generated:
## Okay, let me think about this problem. So the question is: There are ten birds in a tree. A hunter shoots one. How many are left in the tree?
## 
## Hmm, the first thought might be that if there are ten birds and one is shot, then there would be nine left. But wait, maybe there's a trick here. Sometimes these riddles play on assumptions. Let me consider different angles.
## 
## First, when the hunter shoots a bird, the sound of the gunshot might scare the other birds away. So even if one is killed, the others might fly away immediately. So maybe all the remaining birds fly away, leaving zero? But the question says the hunter shoots one, not necessarily killing it. Wait, but usually when someone shoots a bird with a gun, it's to kill it. So perhaps the bird is dead, but the others are scared away. So then the answer would be zero, because the other nine fly away, and the one that was shot is still in the tree as a dead bird? Or does the question count the dead one as "left"?
##  
## Wait, the question says "how many are left in the tree?" So maybe the dead one is still there, but the others flew away. So then it would be 1 left? Or if they all fly away, including the one shot, but that doesn't make sense because the shot bird is probably stuck in the tree. Hmm.
##  
## Alternatively, maybe the question is a classic riddle where the answer is zero because the other birds would fly away when they hear the gunshot. So even though one is shot, the rest leave, so zero remain. But does the dead bird count? The problem says "how many are left in the tree," so the dead one is still present. So maybe that's one. But maybe the question expects the answer zero because the others leave, and the dead one is still there. Wait, but the question might not consider the dead bird as "left" because it's dead. Hmm, tricky.
##  
## Wait, let me think again. The problem states "A hunter shoots one." So the hunter shoots one bird. The question is how many are left. The possible interpretations:
##  
## 1. The hunter successfully kills one bird. The other nine fly away due to the noise, so only the dead one remains. So answer is 1.
##  
## 2. The other birds fly away, so zero are left because the dead one is still there, but the question might not count it. Or maybe the dead one is

##BF16:
## Okay, so there's this riddle: "There are ten birds in a tree. A hunter shoots one. How many are left in the tree?" Hmm, let me think. At first glance, it seems like a simple subtraction problem. Ten birds minus one that's shot would leave nine. But wait, riddles often have a trick or a play on words. Maybe it's not that straightforward.
## 
## Let me break it down. If a hunter shoots one bird, does that mean the other birds would still be there? Well, when a gun is fired, it makes a loud noise. Birds are easily startled by loud noises, right? So maybe all the other birds would fly away because of the gunshot. That would mean after the shot, there are zero birds left in the tree. But then again, the question says the hunter shoots one, so maybe that one is dead and the others flew away. But maybe the riddle is expecting that answer. 
## 
## Alternatively, maybe the hunter shoots and misses? But the question doesn't say that. It just says the hunter shoots one. So assuming the hunter hits the bird, the bird is dead. But the others would probably fly away. So then the answer would be zero. But sometimes these riddles have another angle. Let me think again. Maybe the birds are actually not real? Like, maybe it's a painting or something. But the question doesn't mention that. 
## 
## Another angle: maybe "ten birds" is a red herring. Like, if you have ten birds and one is shot, how many are left? But if they all fly away, then zero. Wait, but the question is a classic one. I think I've heard this before. The answer is usually zero because the other birds get scared and fly away. So even though one is shot, the rest leave, so none remain. Let me check if there's another possible interpretation. 
## 
## Suppose the hunter shoots one bird, so that bird is dead. The rest might not necessarily fly away. Maybe they're not scared? But that's less likely. The typical riddle answer is zero. Alternatively, maybe the question is a trick where "ten birds" is a play on words. Like, "ten" sounds like "then"? But that doesn't seem to fit here. 
## 
## Wait, another thought: maybe the bird that was shot is still considered "left" in the tree because it's dead and hasn't fallen off yet. So then it would be one? But that's

Evaluate the model

pip3 install lm-eval

Convert the GGUF model to an HF model using the following script:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OPEA/QwQ-32B-int4-sym-gguf-q4-0-inc"
filename = "QwQ-32B-33B-Q4_0.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)

model.save_pretrained("QwQ-32B-int4-sym-gguf-q4-0-inc-to-hf")
tokenizer.save_pretrained("QwQ-32B-int4-sym-gguf-q4-0-inc-to-hf")
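Optionally, run a quick generation smoke test on the converted checkpoint before the full evaluation (a sketch; the prompt and generation settings here are arbitrary):

from transformers import AutoModelForCausalLM, AutoTokenizer

path = "QwQ-32B-int4-sym-gguf-q4-0-inc-to-hf"
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, device_map="auto")

inputs = tokenizer("How many r in strawberry.", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))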
Then evaluate the converted model:

auto-round --model "QwQ-32B-int4-sym-gguf-q4-0-inc-to-hf" --eval --eval_bs 16 --tasks lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,mmlu
| Metric | BF16 | INT4 |
| --- | --- | --- |
| Avg | 0.6600 | 0.6561 |
| lambada_openai | 0.6697 | 0.6730 |
| hellaswag | 0.6520 | 0.6500 |
| piqa | 0.7947 | 0.8003 |
| winogrande | 0.6977 | 0.6843 |
| truthfulqa_mc1 | 0.4211 | 0.4186 |
| openbookqa | 0.3540 | 0.3420 |
| boolq | 0.8645 | 0.8609 |
| arc_easy | 0.8089 | 0.8110 |
| arc_challenge | 0.5392 | 0.5256 |
| mmlu | 0.7982 | 0.7955 |

Generate the model

Here is a sample command to generate the quantized model:

auto-round \
--model  Qwen/QwQ-32B \
--device 0 \
--group_size 32 \
--bits 4 \
--iters 50 \
--lr 5e-3 \
--disable_eval \
--format 'gguf:q4_0' \
--output_dir "./tmp_autoround" 
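Equivalently, the quantization can be driven from Python. This is a sketch under the assumption that your installed auto-round version exposes the AutoRound class with these parameters and accepts the 'gguf:q4_0' format in save_quantized; consult the intel/auto-round README for the exact API of your version:

from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "Qwen/QwQ-32B"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Mirror the CLI flags above: 4-bit, group_size 32, symmetric, 50 iters, lr 5e-3.
autoround = AutoRound(model, tokenizer, bits=4, group_size=32, sym=True,
                      iters=50, lr=5e-3)
autoround.quantize()
# Assumption: this auto-round build supports GGUF export via the format argument.
autoround.save_quantized("./tmp_autoround", format="gguf:q4_0")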

Ethical Considerations and Limitations

The model can produce factually incorrect output, and should not be relied on to produce factually accurate information. Because of the limitations of the pretrained model and the finetuning datasets, it is possible that this model could generate lewd, biased or otherwise offensive outputs.

Therefore, before deploying any applications of the model, developers should perform safety testing.

Caveats and Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Here is a useful link to learn more about Intel's AI software:

  • Intel Neural Compressor

Disclaimer

The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.

Cite

@article{cheng2023optimize,
  title={Optimize weight rounding via signed gradient descent for the quantization of llms},
  author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
  journal={arXiv preprint arXiv:2309.05516},
  year={2023}
}

