---
license: mit
datasets:
- tomg-group-umd/ContraStyles
library_name: transformers
---

An unofficial implementation of [CSD](https://github.com/learn2phoenix/CSD) (Contrastive Style Descriptors).

This implementation is inspired by [vvmatorin/CSD](https://huggingface.co./vvmatorin/CSD); the difference is that the CLIP backbone is not the OpenAI CLIP class but an instance of `CLIPVisionModel` from `transformers`.

Inference:

```python
import requests
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

image_url = "https://midjourneysref.com/cdn-cgi/image/format=webp,quality=80,fit=cover/https://explore.midjourneysref.com/1541138391-4-219d8b0b"

def load_image(url):
    # Download the image and open it with PIL
    return Image.open(requests.get(url, stream=True).raw)

processor = AutoProcessor.from_pretrained("NagaSaiAbhinay/CSD")
model = AutoModel.from_pretrained("NagaSaiAbhinay/CSD", trust_remote_code=True).to("cuda")

im = load_image(image_url)

# Preprocess to CLIP pixel values and move them to the GPU
processed_image = processor(images=im, return_tensors="pt").to("cuda")
processed_image = processed_image["pixel_values"]

# The model returns a 3-tuple; the style embedding is the second element
with torch.no_grad():
    _, style_vector, _ = model(processed_image)
```
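
Since CSD produces style embeddings, a typical downstream use is scoring how stylistically similar two images are via cosine similarity of their style vectors. Below is a minimal sketch building on the snippet above (it reuses `processor`, `model`, and `load_image`); the second image URL is a placeholder, not part of the original example.

```python
import torch
import torch.nn.functional as F

# Placeholder URL for a second image; substitute any image you want to compare
other_url = "https://example.com/another-image.jpg"

def style_embedding(url):
    # Reuses the processor and model loaded in the snippet above
    pixels = processor(images=load_image(url), return_tensors="pt").to("cuda")["pixel_values"]
    with torch.no_grad():
        _, style, _ = model(pixels)
    return style

# Cosine similarity of the two style vectors: values near 1.0 indicate similar styles
similarity = F.cosine_similarity(style_embedding(image_url), style_embedding(other_url))
print(f"Style similarity: {similarity.item():.4f}")
```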