---
inference: false
---

This is our **meme captioner model**, a fine-tuned LLaVA-1.5-7B from our paper [*"Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes"*](https://arxiv.org/abs/2404.03022).

**Important:** when we talk about generating captions here, we mean the model producing a concise description of the meme, including its purpose and target audience, not generating the text that appears within the meme itself.

To run the model, follow these steps:

1. Clone our repository and navigate to the LLaVA folder:

```
git clone https://github.com/AmirAbaskohi/Beyond-Words-A-Multimodal-Exploration-of-Persuasion-in-Memes.git
cd LLaVA
```

2. Create the environment and install the dependencies:

```
conda create -n llava_captioner python=3.10 -y
conda activate llava_captioner
pip3 install -e .
pip3 install transformers==4.31.0
pip3 install protobuf
```

3. Finally, you can chat with the model through the CLI by passing our model as the model path (for scripted use, see the Python sketch at the end of this card):

```
python3 -m llava.serve.cli --model-path AmirHossein1378/LLaVA-1.5-7b-meme-captioner --image-file PATH_TO_IMAGE_FILE
```

Please refer to our [GitHub repository](https://github.com/AmirAbaskohi/Beyond-Words-A-Multimodal-Exploration-of-Persuasion-in-Memes) for more information.

If you find our model useful for your research and applications, please cite our work using this BibTeX:

```
@misc{abaskohi2024bcamirs,
      title={BCAmirs at SemEval-2024 Task 4: Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes},
      author={Amirhossein Abaskohi and Amirhossein Dabiriaghdam and Lele Wang and Giuseppe Carenini},
      year={2024},
      eprint={2404.03022},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
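If you want to generate captions from a script rather than the interactive CLI, upstream LLaVA ships an `eval_model` helper. The sketch below is a minimal example, assuming the LLaVA fork in our repository keeps that helper under `llava.eval.run_llava`; run it from inside the `LLaVA` folder with the `llava_captioner` environment active. The prompt string is a placeholder, not the exact instruction used in the paper.

```
# Minimal sketch: programmatic captioning via upstream LLaVA's eval_model helper.
# Assumption: this fork keeps llava.eval.run_llava from upstream LLaVA.
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "AmirHossein1378/LLaVA-1.5-7b-meme-captioner"
# Placeholder prompt; substitute the captioning instruction you want to use.
prompt = "Describe this meme, including its purpose and target audience."
image_file = "PATH_TO_IMAGE_FILE"

# eval_model expects an argparse-style namespace; this mirrors upstream's example.
args = type("Args", (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": prompt,
    "conv_mode": None,
    "image_file": image_file,
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
})()

eval_model(args)
```

In upstream LLaVA, `eval_model` loads the checkpoint on every call and prints the generated text, so for batch captioning you would load the model once via `load_pretrained_model` instead.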