AutoModel.from_pretrained error in loading state_dict
#3, opened by Srymaker
I have the same problem.
I hit the same error. It seems that the text prediction head (weight and bias) has shape [1152, 1152] in the current transformers release, while the weights the authors provided have shape [1536, 1152] to match the visual token output.
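For anyone who wants to confirm which parameters disagree, here is a minimal sketch of the comparison. It works on plain dictionaries of shapes so it runs without downloading anything. The parameter names (`text_model.head.weight`, `text_model.final_layer_norm.weight`) and the helper `find_shape_mismatches` are hypothetical and only for illustration; the two head shapes are the ones reported above.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    model_shapes: Dict[str, Shape],
    checkpoint_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, model_shape, checkpoint_shape) for every shared
    parameter whose shape differs between the model and the checkpoint."""
    mismatches = []
    # Only parameters present on both sides can have a shape conflict.
    for name in sorted(model_shapes.keys() & checkpoint_shapes.keys()):
        if model_shapes[name] != checkpoint_shapes[name]:
            mismatches.append((name, model_shapes[name], checkpoint_shapes[name]))
    return mismatches


# Hypothetical parameter names; the head shapes are taken from this thread.
model_shapes = {
    "text_model.head.weight": (1152, 1152),
    "text_model.final_layer_norm.weight": (1152,),
}
checkpoint_shapes = {
    "text_model.head.weight": (1536, 1152),
    "text_model.final_layer_norm.weight": (1152,),
}

for name, expected, found in find_shape_mismatches(model_shapes, checkpoint_shapes):
    print(f"{name}: model expects {expected}, checkpoint has {found}")
```

With real PyTorch objects, the two dictionaries could be built with something like `{k: tuple(v.shape) for k, v in state_dict.items()}`, once from the instantiated model's `state_dict()` and once from the loaded checkpoint.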
This version (https://github.com/huggingface/transformers/releases/tag/v4.49.0-SigLIP-2) should fix the problem.