ChatGLM3-6B-32k-fp16

Introduction

ChatGLM3-6B-32k is the latest generation of open-source model in the ChatGLM series (THUDM/chatglm3-6b).

This repository stores f16 weights generated with ChatGLM.CPP's GGML quantization tooling.

Performance

Model                       GGML quantize method   HDD size
chatglm3-32k-ggml-f16.bin   f16                    12.5 GB
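
As a rough sanity check on the size in the table, an fp16 file stores 2 bytes per parameter. The parameter count below (~6.24 billion for ChatGLM3-6B) is an approximation, not a figure from this repository:

```python
# Back-of-envelope size check for the f16 weight file.
# params is an assumption (~6.24B for ChatGLM3-6B); fp16 uses 2 bytes each.
params = 6.24e9
bytes_per_param = 2
size_gb = params * bytes_per_param / 1e9  # decimal gigabytes
print(round(size_gb, 1))  # ≈ 12.5, matching the table (metadata adds a little more)
```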

Getting Started

  1. Install dependencies
pip install chatglm-cpp transformers
  2. Download the weights
wget https://huggingface.co./npc0/chatglm3-6b-32k-fp16/resolve/main/chatglm3-32k-ggml-f16.bin
  3. Run the model
import chatglm_cpp

pipeline = chatglm_cpp.Pipeline("./chatglm3-32k-ggml-f16.bin")
pipeline.chat([chatglm_cpp.ChatMessage(role="user", content="你好")])
# Output: ChatMessage(role="assistant", content="你好👋!我是人工智能助手 ChatGLM-6B,很高兴见到你,欢迎问我任何问题。", tool_calls=[])
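Since this checkpoint supports a 32k-token context, longer multi-turn conversations are practical. Below is a minimal sketch of managing chat history; `trim_history` is a hypothetical helper (not part of chatglm-cpp) that approximates token usage by character count and drops the oldest user/assistant pair until a rough budget fits:

```python
# Hypothetical history-trimming helper for long multi-turn chats.
# It approximates token usage by character count; real token counts
# depend on the tokenizer.

def trim_history(messages, max_chars=32000 * 3):
    """Drop the oldest user/assistant pairs until the rough budget fits."""
    kept = list(messages)
    while kept and sum(len(m["content"]) for m in kept) > max_chars:
        kept = kept[2:]  # drop one user/assistant pair
    return kept

history = [
    {"role": "user", "content": "你好"},
    {"role": "assistant", "content": "你好👋!我是人工智能助手 ChatGLM-6B。"},
    {"role": "user", "content": "ChatGLM3 支持多长的上下文?"},
]
history = trim_history(history)

# With the pipeline loaded as in the snippet above, one turn would be:
#   msgs = [chatglm_cpp.ChatMessage(role=m["role"], content=m["content"])
#           for m in history]
#   reply = pipeline.chat(msgs)
#   history.append({"role": "assistant", "content": reply.content})
```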