Size of tensor a must match the size of tensor b

#4
by jbgg - opened

I am pretty new to using torch and transformers, and I am trying to assess bias in news articles, but I often get this error for longer articles (the tensor size is not always 922):
"RuntimeError: The size of tensor a (922) must match the size of tensor b (512) at non-singleton dimension 1"
My understanding was that the tokenizer would already produce tokens of the correct dimensions, but I am obviously wrong, and I'm wondering how to fix this in my code.

Bucket Research org

Are you using the tokenizer code provided in the example? If so, take a look at the padding and truncation docs here:
https://huggingface.co./docs/transformers/main/en/pad_truncation
A possible fix is to add something like
max_length=512, truncation=True to your tokenizer arguments, so that inputs longer than the model's 512-token limit are cut down before being passed to the model.
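
A minimal sketch of what that looks like in practice (the checkpoint name "bert-base-uncased" is just a stand-in here; substitute whatever model you are actually using — the relevant part is the `max_length`/`truncation` arguments):

```python
from transformers import AutoTokenizer

# Stand-in checkpoint; swap in the model you are actually assessing bias with.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Simulate a long news article that exceeds the 512-token limit.
long_article = "word " * 2000

inputs = tokenizer(
    long_article,
    max_length=512,      # cap sequences at the model's maximum length
    truncation=True,     # drop any tokens beyond max_length
    return_tensors="pt",
)

# The sequence dimension is now capped at 512, so the size-mismatch
# RuntimeError at dimension 1 should no longer occur.
print(inputs["input_ids"].shape)
```

Note that truncation simply discards tokens past the limit, so for very long articles you may instead want to split the text into 512-token chunks and aggregate the model's predictions across them.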
