Doesn't Generate `<think>` tags

#25
by bingw5 - opened

The response doesn't contain `<think>` and `</think>` tags, only `</details>`. Is this by design?

Sure it does. Use llama.cpp. Run a command along these lines, depending on your system/OS: `.\llama-cli --model QwQ-32B-Q8_0.gguf --temp 0.0 --color --threads 36 --ctx-size 128000`

Are you using Open-WebUI by any chance? When using it with SillyTavern, it produces `<think>` tags for me just fine. I suggest trying the QwQ-32B 8bpw EXL2 quant with TabbyAPI, using DeepSeek-R1-Distill-Qwen-1.5B-4bpw-exl2 as a draft model for speculative decoding, for the best speed and quality.
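If you want to verify programmatically whether the raw response actually contains the tags (rather than a frontend silently stripping or rendering them), a minimal sketch independent of any particular backend — `extract_think` is a hypothetical helper, not part of any library:

```python
import re

def extract_think(response: str):
    """Return the reasoning inside <think>...</think>, or None if absent."""
    m = re.search(r"<think>(.*?)</think>", response, re.DOTALL)
    return m.group(1).strip() if m else None

# Tags present: the reasoning block is extracted.
print(extract_think("<think>Let me reason.</think>The answer is 4."))  # → Let me reason.
# Tags absent: returns None, which would indicate the problem described above.
print(extract_think("The answer is 4."))  # → None
```

If this returns None on the raw API response, the tags are genuinely missing from the model output; if it finds them, the frontend is likely consuming them before display.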
