RuntimeError: [FT][ERROR] Assertion fail: /group/30063/users/vanewu/FasterTransformer/src/fastertransformer/th_op/glm/GlmOp.h:462
After deploying the model and loading it successfully, I get the error below as soon as I send a question.
[FT][ERROR] [FT][ERROR] CUDA runtime error: no kernel image is available for execution on the device /group/30063/users/vanewu/FasterTransformer/src/fastertransformer/kernels/sampling_topp_kernels.cu:1077
Traceback (most recent call last):
File "demo.py", line 18, in <module>
output_texts = model.generate(prompts, output_length=max_output_length,top_k=30, top_p=0.85, temperature=0.35, repetition_penalty=1.2, do_sample=False)
File "/lyrachatglm/lyraChatGLM/lyra_glm.py", line 145, in generate
outputs = self.model(start_ids=input_token_ids,
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1480, in _call_impl
return forward_call(*args, **kwargs)
File "/lyrachatglm/lyraChatGLM/model.py", line 589, in forward
outputs = self.model.forward(start_ids,
RuntimeError: [FT][ERROR] Assertion fail: /group/30063/users/vanewu/FasterTransformer/src/fastertransformer/th_op/glm/GlmOp.h:462
The error above comes from running the officially provided example, demo.py. Has anyone run into a similar problem?
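
For context, the CUDA message "no kernel image is available for execution on the device" generally means the compiled binary contains no code for this GPU's compute capability. Below is a minimal sketch of how that could be checked. `BUILT_ARCHS` is a hypothetical placeholder, since I do not know which architectures the lyraChatGLM / FasterTransformer binaries were actually built for. The helpers `parse_sm` and `has_kernel_image` are illustrative names and not part of any library. The check only models binary (SASS) compatibility within one major version and ignores PTX JIT fallback.

```python
from typing import Iterable, Tuple

# Hypothetical: architectures the prebuilt library was compiled for.
# Replace with the real list if the maintainers publish it.
BUILT_ARCHS = ["sm_80"]


def parse_sm(arch: str) -> Tuple[int, int]:
    """Turn a string such as 'sm_80' into a (major, minor) pair like (8, 0)."""
    digits = arch.split("_", 1)[1]
    return int(digits[:-1]), int(digits[-1])


def has_kernel_image(device_cc: Tuple[int, int], built_archs: Iterable[str]) -> bool:
    """Return True if some built architecture can run on the device.

    Simplified rule: same major version, and built minor <= device minor.
    PTX forward compatibility is deliberately not modelled here.
    """
    dev_major, dev_minor = device_cc
    for arch in built_archs:
        major, minor = parse_sm(arch)
        if major == dev_major and minor <= dev_minor:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch  # optional: only needed to query the local GPU
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cc = torch.cuda.get_device_capability(0)
        print("GPU:", torch.cuda.get_device_name(0), "compute capability:", cc)
        print("Matches assumed build archs:", has_kernel_image(cc, BUILT_ARCHS))
    else:
        print("No CUDA device visible to PyTorch; nothing to check.")
```

If the check reports a mismatch, rebuilding the library for the local GPU architecture (or running on a GPU the binaries target) would be the thing to try first.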