---
base_model: Qwen/Qwen2-VL-7B-Instruct
language:
- en
library_name: peft
license: mit
tags:
- LLM
- VLM
- Embedding
- Multimodal
pipeline_tag: image-text-to-text
---

## Model Details

Instruction-finetuned adapter for ABC: Achieving Better Control of Multimodal Embeddings using VLMs.

### Model Sources

This adapter is trained on top of Qwen2-VL-7B-Instruct (`Qwen/Qwen2-VL-7B-Instruct`).

### Paper and Website

For more information, please refer to the [paper](https://arxiv.org/abs/2503.00329) and the [website](https://tiger-ai-lab.github.io/ABC/).

## Citation

```
@misc{schneider2025abcachievingbettercontrol,
      title={ABC: Achieving Better Control of Multimodal Embeddings using VLMs},
      author={Benjamin Schneider and Florian Kerschbaum and Wenhu Chen},
      year={2025},
      eprint={2503.00329},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.00329},
}
```
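Since the metadata lists `peft` as the library and `Qwen/Qwen2-VL-7B-Instruct` as the base model, loading the adapter could look roughly like the sketch below. This is a minimal sketch under stated assumptions, not the authors' documented usage: `ADAPTER_ID` is a hypothetical placeholder for wherever this adapter is stored, `load_abc_adapter` is an illustrative helper name, and the card does not describe how embeddings are extracted from the loaded model, so that step is left out.

```python
# Sketch only: ADAPTER_ID is a hypothetical placeholder, not a confirmed repo name.
BASE_MODEL_ID = "Qwen/Qwen2-VL-7B-Instruct"  # taken from the card's `base_model` field
ADAPTER_ID = "path/or/hub-id-of-this-adapter"  # hypothetical; replace with the real location


def load_abc_adapter(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Load the base VLM, attach the PEFT adapter, and return (model, processor)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

    # Load the base Qwen2-VL model; "auto" picks the dtype stored in the checkpoint.
    base = Qwen2VLForConditionalGeneration.from_pretrained(
        base_model_id, torch_dtype="auto"
    )
    # Wrap the base model with the adapter weights found at `adapter_id`.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode: disables dropout

    # The processor (tokenizer + image preprocessing) comes from the base model.
    processor = AutoProcessor.from_pretrained(base_model_id)
    return model, processor
```

Calling `load_abc_adapter()` downloads the 7B base checkpoint, so it needs substantial disk space and memory. For the embedding pipeline itself, the paper and website linked in this card are the authoritative source.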