accelerate tokenizer

#98
by lugim - opened

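torch.tensor is very slow at converting a list of numpy.array objects. In SFT, when DataCollatorForSeq2Seq is called with return_tensors='pt', collation becomes very slow as a result. I therefore suggest that def _pad return its data as a list rather than as numpy.array.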
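To illustrate the idea, here is a minimal sketch of padding that returns plain Python lists. It is not the tokenizer's actual `_pad` implementation: the function name `pad_to_lists` and its parameters `pad_id` and `padding_side` are hypothetical and exist only for this example.

```python
from typing import List, Sequence


def pad_to_lists(
    batch: Sequence[Sequence[int]],
    pad_id: int = 0,            # hypothetical parameter: id used for padding
    padding_side: str = "right",  # hypothetical parameter: "right" or "left"
) -> List[List[int]]:
    """Pad every sequence to the longest length in the batch.

    Returns nested Python lists (not numpy arrays), so a later tensor
    conversion receives one homogeneous list of lists.
    """
    if padding_side not in ("right", "left"):
        raise ValueError(f"unsupported padding_side: {padding_side!r}")

    # Longest sequence in the batch; 0 for an empty batch.
    max_len = max((len(seq) for seq in batch), default=0)

    padded = []
    for seq in batch:
        ids = [int(tok) for tok in seq]  # force plain Python ints
        filler = [pad_id] * (max_len - len(seq))
        padded.append(ids + filler if padding_side == "right" else filler + ids)
    return padded
```

With this shape, the collator would end up calling something like `torch.tensor(pad_to_lists(batch))` on a list of lists instead of on a list of numpy.array objects, which is the slow case described above.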

lugim changed pull request status to open
Ready to merge
This branch is ready to get merged automatically.
