Will there be a GGUF version?
#1
by
Cran-May
- opened
llama.cpp now supports Qwen:
https://github.com/ggerganov/llama.cpp/pull/4281
Here is the series of Qwen models (support for the 1.8B model has not been verified):
https://huggingface.co./Qwen/Qwen-72B-Chat
https://huggingface.co./Qwen/Qwen-14B-Chat
https://huggingface.co./Qwen/Qwen-7B-Chat
https://huggingface.co./Qwen/Qwen-1_8B-Chat
https://huggingface.co./Qwen/Qwen-Audio-Chat
https://huggingface.co./Qwen/Qwen-VL-Chat
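Since llama.cpp now supports the architecture, anyone can produce a GGUF themselves from one of the checkpoints above. A minimal sketch, assuming a local llama.cpp checkout from around PR #4281 and a downloaded Hugging Face model directory (paths and quantization type are illustrative, not prescribed):

```shell
# Clone llama.cpp and install the Python dependencies for conversion.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt

# Convert the Hugging Face checkpoint to a 16-bit GGUF file.
# /path/to/Qwen-7B-Chat is a placeholder for your local model directory.
python convert-hf-to-gguf.py /path/to/Qwen-7B-Chat \
    --outtype f16 --outfile qwen-7b-chat-f16.gguf

# Optionally quantize; Q4_K_M is a common size/quality trade-off.
make quantize
./quantize qwen-7b-chat-f16.gguf qwen-7b-chat-q4_k_m.gguf Q4_K_M
```

Note this only covers the text models; the multimodal checkpoints (Qwen-Audio-Chat, Qwen-VL-Chat) would need separate support.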
Maybe InternLM as well? https://github.com/ggerganov/llama.cpp/pull/4283
https://huggingface.co./internlm/internlm-chat-20b
https://huggingface.co./internlm/internlm-chat-7b-v1_1
Qwen-VL-Chat does not seem to be supported.