updated max_position_embeddings to be the max model input length.

Mixedbread org

Thanks @ouz-m

juliuslipp changed pull request status to merged

@ouz-m Have you tested it post-merge / with this PR?

@michaelfeil I now had a chance to test it, and it works!

@juliuslipp @ouz-m @michaelfeil

I have my concerns that this does not work with Torch: https://github.com/UKPLab/sentence-transformers/issues/2873
How to reproduce:

from sentence_transformers import SentenceTransformer

# Fails at load time with a size mismatch for embeddings.position_embeddings.weight
model = SentenceTransformer("mixedbread-ai/deepset-mxbai-embed-de-large-v1")

Put simply: in model.safetensors, the embeddings.position_embeddings.weight tensor has shape [514, 1024], which can't be loaded into a model whose config now specifies shape [512, 1024].
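
To see the mismatch directly, here is a minimal sketch (the huggingface_hub / safetensors usage is an illustration, not part of the original report) that reads the tensor shape straight from the checkpoint header without loading the model:

from huggingface_hub import hf_hub_download
from safetensors import safe_open
from transformers import AutoConfig

repo = "mixedbread-ai/deepset-mxbai-embed-de-large-v1"

# Download only the checkpoint file and read the tensor shape from its header
ckpt = hf_hub_download(repo, "model.safetensors")
with safe_open(ckpt, framework="pt") as f:
    shape = f.get_slice("embeddings.position_embeddings.weight").get_shape()
print(shape)  # [514, 1024] stored in the checkpoint

# The updated config declares fewer positions, hence the size mismatch on load
config = AutoConfig.from_pretrained(repo)
print(config.max_position_embeddings)  # 512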

- Tom Aarsen
Mixedbread org

Hey @tomaarsen, you're right: the original XLM-RoBERTa was trained with 514 max position embeddings (see here). You can find the explanation here.
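
For reference, a quick sketch to verify this against the upstream config (assuming Hub access); the two extra rows exist because RoBERTa-style position ids start at pad_token_id + 1:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("xlm-roberta-large")
print(config.max_position_embeddings)  # 514
print(config.pad_token_id)             # 1

# Positions start at pad_token_id + 1 = 2, so 512 usable positions
# need a 512 + 2 = 514-row position embedding table.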

@michaelfeil I think the right fix would be to fix Optimum instead of changing the model config. I will look into that more closely.
