Size mismatch

#10
by neighborwang - opened

I'm hitting the same error as in https://github.com/huggingface/autotrain-advanced/issues/487 when trying to merge my adapter (a fine-tuned LoRA model: https://huggingface.co./neighborwang/ModeliCo-7B) into the base model (Qwen2.5-Coder-7B-Instruct).

RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.model.embed_tokens.weight: 
copying a param with shape torch.Size([151665, 3584]) from checkpoint, 
the shape in current model is torch.Size([152064, 3584]).
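
For reference, the mismatch can also be seen by comparing the shapes stored in the adapter checkpoint against the base model's config. A rough sketch (the adapter file name below is the usual PEFT default and may differ):

```python
# Sketch: compare embed_tokens / lm_head shapes in the adapter checkpoint
# with the base model's configured vocab_size.
# "adapter_model.safetensors" is the usual PEFT file name and may differ.
from huggingface_hub import hf_hub_download
from safetensors import safe_open
from transformers import AutoConfig

adapter_file = hf_hub_download("neighborwang/ModeliCo-7B", "adapter_model.safetensors")
with safe_open(adapter_file, framework="pt") as f:
    for name in f.keys():
        if "embed_tokens" in name or "lm_head" in name:
            print(name, f.get_slice(name).get_shape())  # 151665 rows here

config = AutoConfig.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")
print("base vocab_size:", config.vocab_size)  # 152064 for the 7B model
```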

I faced the same issue with Llama 3.1 and solved it by pinning a specific transformers version, so for my adapter and Qwen2.5-Coder-7B-Instruct I tried the following transformers versions:

v4.45.1
v4.45.0
v4.44.0
v4.43.0
v4.37.0

But none of them work... I need some help. Other people in the GitHub issue I mentioned above are also facing this.

Thanks a lot in advance!

Qwen org

Hi, it appears that the embed_tokens and the lm_head in the adapter have different shapes from those in the base model. Please try manually padding the tensors from the adapter model or truncating the tensors from the base model.

FYI: the vocabulary size (151665) is different from the size of the embed_tokens and the lm_head (which depends on the model size; for 7B it is 152064, the vocab_size in config.json). Normally, the size from config.json is used.
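
In case it helps, here is a minimal sketch of the "truncate the base model" route, assuming the adapter checkpoint stores 151665-row embed_tokens and lm_head tensors as in the error message (untested; model IDs taken from this thread):

```python
# Sketch: shrink the base model's embeddings to match the adapter checkpoint,
# then load and merge the LoRA adapter with PEFT.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct", torch_dtype=torch.bfloat16
)

# Resize embed_tokens (and lm_head) from 152064 to 151665 rows so the shapes
# match what the adapter checkpoint expects.
base.resize_token_embeddings(151665)

model = PeftModel.from_pretrained(base, "neighborwang/ModeliCo-7B")
merged = model.merge_and_unload()
merged.save_pretrained("ModeliCo-7B-merged")
```

resize_token_embeddings adjusts both the input embeddings and the lm_head, so both mismatched tensors should line up afterwards.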

Hi Xuancheng, thanks a lot for your answer!

I used AutoTrain, which I thought was supposed to match the parameters automatically. Is this a problem with AutoTrain, or what caused the mismatch?

Thx!

When the merge_adapter param is set to true in AutoTrain, the model seems to merge fine.
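
After AutoTrain finishes the merge, a quick sanity check is to confirm that the merged model's embedding shape matches its config.json. A sketch ("my-autotrain-project" is a placeholder for the output directory):

```python
# Sketch: verify that the merged model's embedding rows agree with
# config.json's vocab_size (152064 for Qwen2.5 7B).
from transformers import AutoConfig, AutoModelForCausalLM

merged_path = "my-autotrain-project"  # placeholder for the AutoTrain output dir
config = AutoConfig.from_pretrained(merged_path)
model = AutoModelForCausalLM.from_pretrained(merged_path)

print("config vocab_size:", config.vocab_size)
print("embed_tokens rows:", model.get_input_embeddings().weight.shape[0])
```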

OK, I will try that. Thank you very much, Abhishek!

neighborwang changed discussion status to closed
