---
license: mit
datasets:
- tomg-group-umd/ContraStyles
library_name: transformers
---

An unofficial implementation of [CSD](https://github.com/learn2phoenix/CSD) (Contrastive Style Descriptors).

This implementation is inspired by [vvmatorin/CSD](https://huggingface.co./vvmatorin/CSD); the difference is that the CLIP backbone is not the OpenAI CLIP class but an instance of `CLIPVisionModel` from `transformers`.

Inference:

```python
import requests
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

image_url = "https://midjourneysref.com/cdn-cgi/image/format=webp,quality=80,fit=cover/https://explore.midjourneysref.com/1541138391-4-219d8b0b"

def load_image(url):
    # Download the image and open it with PIL
    return Image.open(requests.get(url, stream=True).raw)

processor = AutoProcessor.from_pretrained("NagaSaiAbhinay/CSD")
model = AutoModel.from_pretrained("NagaSaiAbhinay/CSD", trust_remote_code=True).to("cuda")

im = load_image(image_url)

# Preprocess to CLIP pixel values and move them to the GPU
processed_image = processor(images=im, return_tensors="pt").to("cuda")
processed_image = processed_image["pixel_values"]

# The model returns a 3-tuple; the style embedding is the second element
with torch.no_grad():
    _, style_vector, _ = model(processed_image)
```
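
Since CSD produces style embeddings, a typical downstream use is scoring how stylistically similar two images are via cosine similarity of their style vectors. Below is a minimal sketch building on the snippet above (it reuses `processor`, `model`, and `load_image`); the second image URL is a placeholder, not part of the original example.

```python
import torch
import torch.nn.functional as F

# Placeholder URL for a second image; substitute any image you want to compare
other_url = "https://example.com/another-image.jpg"

def style_embedding(url):
    # Reuses the processor and model loaded in the snippet above
    pixels = processor(images=load_image(url), return_tensors="pt").to("cuda")["pixel_values"]
    with torch.no_grad():
        _, style, _ = model(pixels)
    return style

# Cosine similarity of the two style vectors: values near 1.0 indicate similar styles
similarity = F.cosine_similarity(style_embedding(image_url), style_embedding(other_url))
print(f"Style similarity: {similarity.item():.4f}")
```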