THUDM/chatglm-6b · add custom onnx export config

Apr 21, 2023

Currently this supports only batch=1 and use_past=False,
and several parts of the model structure need to be fixed to support export to ONNX.

K024

Apr 23, 2023


I wrote some reference code that you can try: https://github.com/K024/chatglm-q

May 18, 2023

I wrote some reference code that you can try: https://github.com/K024/chatglm-q

https://huggingface.co./K024/ChatGLM-6b-onnx-u8s8
The results are not great, and GPU support for the operators is very poor.

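I exported the int8 model with onnx_export.py from https://github.com/K024/chatglm-q/examples and loaded it with the code from https://huggingface.co./K024/ChatGLM-6b-onnx-u8s8. It starts successfully, but the answers to questions are incorrect.
Environment:
CentOS 7, CUDA 11.7, Python 3.10.9

Run output:

List of exported ONNX model files:

@K024 Could you help me figure out what the problem is?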

K024

May 18, 2023

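@boriszhang78 Numerical issues are hard to diagnose from a distance. First check the model's output under torch, then try turning off onnxruntime's default graph optimizations.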
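A minimal sketch of that suggestion, not taken from the chatglm-q repository: it creates an onnxruntime session with graph optimizations disabled and includes a small helper for comparing logits against the torch reference. The helper names `build_unoptimized_session` and `max_abs_diff` are hypothetical, and the choice of `CPUExecutionProvider` is an assumption made to rule out GPU operator differences first.

```python
from typing import Sequence


def build_unoptimized_session(model_path: str):
    """Create an onnxruntime session with all graph optimizations turned off."""
    # Imported lazily so the comparison helper below works without onnxruntime.
    import onnxruntime as ort

    opts = ort.SessionOptions()
    # ORT_DISABLE_ALL switches off graph-level rewrites such as operator fusion.
    opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
    return ort.InferenceSession(
        model_path,
        sess_options=opts,
        providers=["CPUExecutionProvider"],  # assumption: debug on CPU first
    )


def max_abs_diff(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flat logit vectors."""
    if len(reference) != len(candidate):
        raise ValueError("logit vectors must have the same length")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)
```

If the difference between the torch logits and the ONNX logits shrinks noticeably once optimizations are off, an optimizer rewrite is a likely suspect. If it stays large, the export or the quantization itself is the more probable cause.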