about pad_token_id

#3
by ToughStone - opened

I got an error when loading the model:
size mismatch for model.decoder.embed_positions.weight: copying a param with shape torch.Size([1026, 768]) from checkpoint, the shape in current model is torch.Size([1025, 768]).
When creating the position embedding layer, the dimension is set to 1024 + pad_token_id + 1. In the Chinese vocabulary, pad_token_id = 0, while in English it is 1. Where is the problem?
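
For reference, this is the sizing logic I mean (a sketch of the older transformers modeling_bart.py behavior, where the table size depends on padding_idx; newer versions use a fixed offset of 2 instead):

```python
import torch.nn as nn

class LearnedPositionalEmbedding(nn.Embedding):
    def __init__(self, num_embeddings: int, embedding_dim: int, padding_idx: int):
        # The table is enlarged by padding_idx + 1, so its final size
        # depends on pad_token_id: 1024 + 1 + 1 = 1026 with the English
        # pad_token_id = 1, but 1024 + 0 + 1 = 1025 with the Chinese
        # pad_token_id = 0 -- which produces the shape mismatch above.
        num_embeddings += padding_idx + 1
        super().__init__(num_embeddings, embedding_dim, padding_idx)

# Checkpoint saved with pad_token_id = 1 -> torch.Size([1026, 768])
print(LearnedPositionalEmbedding(1024, 768, padding_idx=1).weight.shape)
# Model configured with pad_token_id = 0 -> torch.Size([1025, 768])
print(LearnedPositionalEmbedding(1024, 768, padding_idx=0).weight.shape)
```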

Fudan NLP org

How did you load the model and tokenizer? Both should be loaded from bart-base-chinese.
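
For example (a minimal sketch, assuming the fnlp/bart-base-chinese repo id; the model card pairs BertTokenizer with BartForConditionalGeneration):

```python
from transformers import BertTokenizer, BartForConditionalGeneration

# Load both pieces from the same checkpoint so the config's
# pad_token_id matches the saved position-embedding shape.
tokenizer = BertTokenizer.from_pretrained("fnlp/bart-base-chinese")
model = BartForConditionalGeneration.from_pretrained("fnlp/bart-base-chinese")

inputs = tokenizer("北京是[MASK]的首都", return_tensors="pt")
outputs = model.generate(inputs["input_ids"])
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```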

yf changed discussion status to closed
