eval error with LLaMA-Factory
python src/evaluate.py --model_name_or_path ~/model/Qwen-14B-Chat-LLaMAfied --finetuning_type full --template llama2 --task ceval --split validation --lang zh --n_shot 5 --batch_size 1
/models/llama/modeling_llama.py", line 726, in forward
attn_output = torch.nn.functional.scaled_dot_product_attention(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: cutlassF: no kernel found to launch!
Installing FlashAttention-2 may help.
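For reference, FlashAttention-2 is usually installed from PyPI roughly like this (exact requirements depend on your CUDA/PyTorch build, and note that FlashAttention-2 only supports Ampere or newer GPUs):
pip install flash-attn --no-build-isolation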
Thanks
I'm running it on a V100; changing torch_dtype from bfloat16 to float16 in config.json fixed that error.
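For anyone else hitting the cutlassF error on a pre-Ampere card, the change is just the torch_dtype field in the model's config.json. Assuming the file has the usual Hugging Face entry "torch_dtype": "bfloat16", something like this does it:
# switch the checkpoint to float16 since V100 has no bfloat16 support
sed -i 's/"torch_dtype": "bfloat16"/"torch_dtype": "float16"/' ~/model/Qwen-14B-Chat-LLaMAfied/config.json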
But then I ran into a new error:
LLaMA-Factory/src/llmtuner/eval/evaluator.py", line 44, in
word_probs = torch.stack([logits[i, lengths[i] - 1] for i in range(len(lengths))], dim=0)
~~~~~~^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
/opt/conda/conda-bld/pytorch_1702400430266/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [176,0,0], thread: [0,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1702400430266/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [176,0,0], thread: [1,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed.
/opt/conda/conda-bld/pytorch_1702400430266/work/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [176,0,0], thread: [2,0,0] Assertion -sizes[i] <= index && index < sizes[i] && "index out of bounds" failed.
use --template qwen
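That is, re-run the same command as above with the Qwen chat template:
python src/evaluate.py --model_name_or_path ~/model/Qwen-14B-Chat-LLaMAfied --finetuning_type full --template qwen --task ceval --split validation --lang zh --n_shot 5 --batch_size 1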
I changed to --template qwen and it's still the same. I switched to an A100 machine and the model runs normally there. I also just tested a fresh installation of the latest unsloth on this V100 server, but it looks like there are still issues with the environment.
Thanks a lot