accelerate tokenizer

#98

by lugim - opened Sep 13, 2023


lugim

Sep 13, 2023


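torch.tensor is very slow at converting a list of numpy.array objects. During SFT, when DataCollatorForSeq2Seq is called with return_tensors='pt', this conversion makes collation very slow. I therefore suggest that def _pad return a list rather than a numpy.array.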
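A minimal sketch of the idea, not the actual patch: the helper name `pad_batch` and its parameters are hypothetical and only illustrate padding that yields nested Python lists instead of one numpy.array per sample.

```python
from typing import List, Sequence


def pad_batch(
    sequences: Sequence[Sequence[int]],
    pad_id: int = 0,
    padding_side: str = "right",
) -> List[List[int]]:
    """Pad every sequence to the longest length; return nested Python lists.

    Hypothetical helper for illustration only, not the tokenizer's real _pad.
    """
    if padding_side not in ("left", "right"):
        raise ValueError(f"unsupported padding_side: {padding_side!r}")

    # Longest sequence in the batch; 0 for an empty batch.
    max_len = max((len(seq) for seq in sequences), default=0)

    padded: List[List[int]] = []
    for seq in sequences:
        row = list(seq)  # plain list, never a numpy.array
        fill = [pad_id] * (max_len - len(row))
        padded.append(row + fill if padding_side == "right" else fill + row)
    return padded


batch = pad_batch([[1, 2, 3], [4, 5]], pad_id=0)
print(batch)  # [[1, 2, 3], [4, 5, 0]]
```

The nested list can then be passed straight to `torch.tensor(batch, dtype=torch.long)`. If the per-sample arrays cannot be avoided, stacking them into a single array first (for example with `numpy.stack`) and converting that one array is the usual workaround.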

lugim changed pull request status to open Sep 14, 2023


Ready to merge

This branch is ready to get merged automatically.
