preprocessor_config.json missing
Hello,
is the code used to port the model available somewhere? I am running into problems when trying to load the processor for the model. Any help or pointers to relevant code would be appreciated!
CODE:
self.processor = VisionTextDualEncoderProcessor.from_pretrained(model_name)
ERROR:
.venv/lib/python3.10/site-packages/transformers/utils/hub.py", line 463, in cached_file
raise EnvironmentError(
OSError: calpt/CLIP-ViT-H-14-frozen-xlm-roberta-large-laion5B-s13B-b90k does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co./calpt/CLIP-ViT-H-14-frozen-xlm-roberta-large-laion5B-s13B-b90k/main' for available files.
ALTERNATIVE:
I also tried building the processor from a separate tokenizer and image processor:
self.tokenizer = AutoTokenizer.from_pretrained(MODEL.pretrained_dual_text)
self.image_processor = AutoFeatureExtractor.from_pretrained(MODEL.pretrained_dual_image)
self.processor = VisionTextDualEncoderProcessor(self.image_processor, self.tokenizer)
with
pretrained_dual_text = "xlm-roberta-large"
pretrained_dual_image = "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"
This resulted in an internal shape error, which I assume comes from not using the correct preprocessors for this model?
.venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [32, 150528]
Hey,
you can find the code for porting the model from OpenCLIP here: https://gist.github.com/calpt/8e3555bd11f1916b5169c8125117e5ee
This repo contains only the model checkpoint, without a tokenizer config or preprocessor config. The correct tokenizer and preprocessor to use are the following:
- tokenizer:
xlm-roberta-large
- preprocessor:
laion/CLIP-ViT-H-14-laion2B-s32B-b79K
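Put together, loading the two pieces and wrapping them in one processor could look roughly like the sketch below. It is a minimal sketch, not tested against this checkpoint: `build_processor` and `is_image_batch` are hypothetical helper names, `AutoImageProcessor` is my choice here (the question used `AutoFeatureExtractor`), and the channels-first RGB layout checked by the helper is an assumption.

```python
from typing import Sequence

# Names taken from the list above.
TOKENIZER_NAME = "xlm-roberta-large"
IMAGE_PROCESSOR_NAME = "laion/CLIP-ViT-H-14-laion2B-s32B-b79K"


def build_processor():
    """Combine the tokenizer and image preprocessor into a single processor.

    Hypothetical helper: downloads both configs from the Hub when called.
    """
    # Imported lazily so the shape helper below works without transformers.
    from transformers import (
        AutoImageProcessor,
        AutoTokenizer,
        VisionTextDualEncoderProcessor,
    )

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_NAME)
    image_processor = AutoImageProcessor.from_pretrained(IMAGE_PROCESSOR_NAME)
    return VisionTextDualEncoderProcessor(
        image_processor=image_processor, tokenizer=tokenizer
    )


def is_image_batch(shape: Sequence[int]) -> bool:
    """Return True if `shape` looks like (batch, channels, height, width).

    Assumes channels-first RGB input, which is what conv2d expects here.
    """
    return len(shape) == 4 and shape[1] == 3
```

Since this is the same tokenizer and preprocessor pair as in the question, the shape error may have a different cause. The reported size [32, 150528] would be consistent with a batch of 32 RGB images at 224x224 that were flattened (3 * 224 * 224 = 150528), so it may be worth checking with something like `is_image_batch(pixel_values.shape)` right before the model call.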