THUDM/chatglm-6b


A pure C++ implementation that supports CUDA, CPU, OpenCL, etc.

#17

by zhaode - opened Mar 28, 2023

zhaode, Mar 28, 2023 (edited Mar 28, 2023)

https://github.com/wangzhaode/ChatGLM-MNN

Pure C++.
Depends only on MNN, supports multiple devices, and is easy to deploy.
Splits the model into 28 blocks so that different blocks can run on different devices.
Slims the vocabulary from 150528 to 130528 tokens.
Faster than the PyTorch implementation.
Provides CLI and web demos.
Supports inference (forward pass) on Android devices.
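
To illustrate the 28-block split, here is a minimal sketch of how blocks might be assigned to devices. It is not taken from the ChatGLM-MNN code: the `Device` enum, the `plan_blocks` function, and the "first N blocks on the GPU, the rest on the CPU" policy are all assumptions made for illustration, and no MNN API is used.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical device tags for illustration; these are not MNN identifiers.
enum class Device { CPU, GPU };

// Build a placement plan: the first `gpu_blocks` blocks go to the GPU,
// the remaining blocks stay on the CPU. `gpu_blocks` is clamped to
// `total_blocks` so an oversized request simply places everything on the GPU.
std::vector<Device> plan_blocks(std::size_t total_blocks, std::size_t gpu_blocks) {
    gpu_blocks = std::min(gpu_blocks, total_blocks);
    std::vector<Device> plan(total_blocks, Device::CPU);
    std::fill(plan.begin(),
              plan.begin() + static_cast<std::ptrdiff_t>(gpu_blocks),
              Device::GPU);
    return plan;
}
```

With this sketch, `plan_blocks(28, 10)` would place blocks 0 to 9 on the GPU and blocks 10 to 27 on the CPU. A real deployment would presumably choose the split based on available GPU memory.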
