Loading the model with DeepSpeed for fine-tuning exhausts memory and fails with: Step 1 exited with non-zero status 247 exits with return code = -9

#3
by x-lin - opened

This problem does not occur with the original 1b1 model, before it was pruned.
