ChatGLM3-6B-32k-fp16

Introduction

ChatGLM3-6B-32k is the latest generation of open-source model in the ChatGLM series (THUDM/chatglm3-6b).

This repository stores f16 weights generated with ChatGLM.CPP's GGML quantization tooling.

Performance

Model                       GGML quantize method   HDD size
chatglm3-32k-ggml-f16.bin   f16                    12.5 GB
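
As a rough sanity check on the size in the table, an fp16 file stores 2 bytes per parameter. The parameter count below (~6.24 billion for ChatGLM3-6B) is an approximation, not a figure from this repository:

```python
# Back-of-envelope size check for the f16 weight file.
# params is an assumption (~6.24B for ChatGLM3-6B); fp16 uses 2 bytes each.
params = 6.24e9
bytes_per_param = 2
size_gb = params * bytes_per_param / 1e9  # decimal gigabytes
print(round(size_gb, 1))  # ≈ 12.5, matching the table (metadata adds a little more)
```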

Getting Started

  1. Install dependencies
pip install chatglm-cpp transformers
  2. Download the weights
wget https://huggingface.co./npc0/chatglm3-6b-32k-fp16/resolve/main/chatglm3-32k-ggml-f16.bin
  3. Run the model
import chatglm_cpp

pipeline = chatglm_cpp.Pipeline("./chatglm3-32k-ggml-f16.bin")
pipeline.chat([chatglm_cpp.ChatMessage(role="user", content="你好")])
# Output: ChatMessage(role="assistant", content="你好👋!我是人工智能助手 ChatGLM-6B,很高兴见到你,欢迎问我任何问题。", tool_calls=[])
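Since this checkpoint supports a 32k-token context, longer multi-turn conversations are practical. Below is a minimal sketch of managing chat history; `trim_history` is a hypothetical helper (not part of chatglm-cpp) that approximates token usage by character count and drops the oldest user/assistant pair until a rough budget fits:

```python
# Hypothetical history-trimming helper for long multi-turn chats.
# It approximates token usage by character count; real token counts
# depend on the tokenizer.

def trim_history(messages, max_chars=32000 * 3):
    """Drop the oldest user/assistant pairs until the rough budget fits."""
    kept = list(messages)
    while kept and sum(len(m["content"]) for m in kept) > max_chars:
        kept = kept[2:]  # drop one user/assistant pair
    return kept

history = [
    {"role": "user", "content": "你好"},
    {"role": "assistant", "content": "你好👋!我是人工智能助手 ChatGLM-6B。"},
    {"role": "user", "content": "ChatGLM3 支持多长的上下文?"},
]
history = trim_history(history)

# With the pipeline loaded as in the snippet above, one turn would be:
#   msgs = [chatglm_cpp.ChatMessage(role=m["role"], content=m["content"])
#           for m in history]
#   reply = pipeline.chat(msgs)
#   history.append({"role": "assistant", "content": reply.content})
```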