---
license: creativeml-openrail-m
base_model: runwayml/stable-diffusion-v1-5
instance_prompt: photo of a bayc nft
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- dreambooth
inference: true
pipeline_tag: text-to-image
---

# DreamBooth - Bored Ape Yacht Club

## Model Description

This DreamBooth model is derived from `runwayml/stable-diffusion-v1-5` and fine-tuned on images from the Bored Ape Yacht Club (BAYC) NFT collection. The weights were trained with the [DreamBooth](https://dreambooth.github.io/) method, using `photo of a bayc nft` as the instance prompt, so that text prompts can produce images in the style of the collection.

### Training

The training images were retrieved through the Covalent API, specifically via this [endpoint](https://www.covalenthq.com/docs/api/nft/get-nft-token-ids-for-contract-with-metadata/).

### Inference

At inference time the model generates new images that follow the visual style of the Bored Ape Yacht Club collection. Including `bayc nft` in the prompt, as in the usage example, steers generation toward that style. Sample outputs:

![img_0](./image_0.png)
![img_1](./image_1.png)
![img_2](./image_2.png)

## Usage

A basic example of generating images with this model:

```python
import numpy as np
import torch
from diffusers import DDIMScheduler, StableDiffusionPipeline, UNet2DConditionModel
from transformers import CLIPTextModel

model_id = "runwayml/stable-diffusion-v1-5"

# Load the fine-tuned UNet and text encoder in half precision to match the pipeline.
unet = UNet2DConditionModel.from_pretrained(
    "ckandemir/bayc-diffusion", subfolder="unet", torch_dtype=torch.float16
)
text_encoder = CLIPTextModel.from_pretrained(
    "ckandemir/bayc-diffusion", subfolder="text_encoder", torch_dtype=torch.float16
)

# Build the pipeline from the base model, swapping in the fine-tuned components.
pipeline = StableDiffusionPipeline.from_pretrained(
    model_id,
    unet=unet,
    text_encoder=text_encoder,
    torch_dtype=torch.float16,
    use_safetensors=True,
).to("cuda")
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

prompt = ["a spiderman bayc nft"]
neg_prompt = ["realistic,disfigured face,eye patch,disfigured eyes, disfigured, deformed,bad anatomy"] * len(prompt)
num_samples = 3
guidance_scale = 9
num_inference_steps = 50
height = 512
width = 512

# Print the seed so that a run can be reproduced later.
seed = int(np.random.randint(0, 2**20 - 1))
print(f"Seed: {seed}")
generator = torch.Generator(device="cuda").manual_seed(seed)

with torch.autocast("cuda"), torch.inference_mode():
    imgs = pipeline(
        prompt,
        negative_prompt=neg_prompt,
        height=height,
        width=width,
        num_images_per_prompt=num_samples,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        generator=generator,
    ).images

# Save each image; in a Jupyter notebook, display(img) shows it inline instead.
for i, img in enumerate(imgs):
    img.save(f"bayc_{i}.png")
```

## Further Optimization

Results can be improved by further fine-tuning and by adjusting the training parameters, such as the learning rate and the number of training steps.
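
As a starting point for such experiments, the sketch below assembles a command line for the `train_dreambooth.py` example script that ships with the diffusers repository. It is a minimal sketch under stated assumptions, not the configuration used to train this model: the helper name `build_dreambooth_command`, the directory paths, and every hyperparameter value are illustrative. Only the base model and the instance prompt come from this card's metadata. The `--train_text_encoder` flag is included because the usage example loads a fine-tuned text encoder, but whether the original run used that flag is an assumption.

```python
import shlex


def build_dreambooth_command(
    instance_data_dir,
    output_dir,
    learning_rate=2e-6,        # illustrative value, not the original setting
    max_train_steps=800,       # illustrative value, not the original setting
    train_text_encoder=True,   # assumption: the text encoder is fine-tuned too
):
    """Build an `accelerate launch` argument list for diffusers' train_dreambooth.py."""
    args = [
        "accelerate", "launch", "train_dreambooth.py",
        "--pretrained_model_name_or_path", "runwayml/stable-diffusion-v1-5",
        "--instance_data_dir", instance_data_dir,
        "--instance_prompt", "photo of a bayc nft",
        "--output_dir", output_dir,
        "--resolution", "512",
        "--train_batch_size", "1",
        "--learning_rate", str(learning_rate),
        "--lr_scheduler", "constant",
        "--lr_warmup_steps", "0",
        "--max_train_steps", str(max_train_steps),
    ]
    if train_text_encoder:
        args.append("--train_text_encoder")
    return args


# Hypothetical paths; replace them with your own image folder and output folder.
command = build_dreambooth_command("./bayc_images", "./bayc-diffusion")
print(shlex.join(command))
```

Varying `learning_rate` and `max_train_steps` one at a time, and comparing samples generated with a fixed seed, makes it easier to attribute a change in output quality to a single parameter.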