---
pipeline_tag: zero-shot-image-classification
---

This repository contains the models from the paper [Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality](https://huggingface.co/papers/2410.05210).