---
license: creativeml-openrail-m
base_model: runwayml/stable-diffusion-v1-5
instance_prompt: photo of a bayc nft
tags:
  - stable-diffusion
  - stable-diffusion-diffusers
  - text-to-image
  - diffusers
  - dreambooth
inference: true
pipeline_tag: text-to-image
---

# DreamBooth - Bored Ape Yacht Club

## Model Description

This model is a DreamBooth fine-tune of runwayml/stable-diffusion-v1-5, trained on photos from the Bored Ape Yacht Club (BAYC) NFT collection. The fine-tuned weights enable text-to-image synthesis in the visual style of BAYC NFTs.

## Training

The training images were sourced from the Covalent API, specifically via this endpoint.
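
For illustration only, here is a minimal sketch of how BAYC images could be pulled from an NFT metadata endpoint. The endpoint path, response shape, and `COVALENT_API_KEY` environment variable are assumptions for the sketch, not the exact pipeline used to build this dataset:

```python
import os
import requests

# BAYC contract on Ethereum mainnet.
BAYC_CONTRACT = "0xBC4CA0EdA7647A8aB7C2061c2E118A18a936f13D"
# Hypothetical: assumes an API key is provided via the environment.
API_KEY = os.environ["COVALENT_API_KEY"]

def fetch_bayc_image(token_id: int) -> bytes:
    # Hypothetical metadata endpoint returning JSON that includes an image URL;
    # the real endpoint path and response layout may differ.
    url = f"https://api.covalenthq.com/v1/1/tokens/{BAYC_CONTRACT}/nft_metadata/{token_id}/"
    resp = requests.get(url, params={"key": API_KEY}, timeout=30)
    resp.raise_for_status()
    item = resp.json()["data"]["items"][0]
    image_url = item["nft_data"][0]["external_data"]["image"]
    return requests.get(image_url, timeout=30).content

# Download the first ten token images.
for token_id in range(10):
    with open(f"bayc_{token_id}.png", "wb") as f:
        f.write(fetch_bayc_image(token_id))
```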

## Inference

Inference generates original images in the style of the Bored Ape Yacht Club collection, so prompts can combine new concepts with the distinctive aesthetics of BAYC NFTs.

*Sample generations from this model.*

## Usage

Here's a basic example of how to use this model to generate images:

```python
import torch
import numpy as np
from torch import autocast
from diffusers import StableDiffusionPipeline, DDIMScheduler, UNet2DConditionModel
from transformers import CLIPTextModel
from IPython.display import display

model_id = "runwayml/stable-diffusion-v1-5"

# Load the fine-tuned UNet and text encoder from this repository.
unet = UNet2DConditionModel.from_pretrained("ckandemir/boredape_diffusion", subfolder="unet")
text_encoder = CLIPTextModel.from_pretrained("ckandemir/boredape_diffusion", subfolder="text_encoder")

# Build the pipeline from the base model, swapping in the fine-tuned components.
pipeline = StableDiffusionPipeline.from_pretrained(
    model_id, unet=unet, text_encoder=text_encoder, torch_dtype=torch.float16, use_safetensors=True
).to("cuda")
pipeline.scheduler = DDIMScheduler.from_config(pipeline.scheduler.config)

prompt = ["a spiderman bayc nft"]
neg_prompt = ["realistic, disfigured face, disfigured eyes, deformed, bad anatomy"] * len(prompt)
num_samples = 3
guidance_scale = 9
num_inference_steps = 50
height = 512
width = 512

# Pick a random seed and print it so the run can be reproduced later.
seed = np.random.randint(0, 2**20 - 1)
print(f"Seed: {seed}")
generator = torch.Generator(device="cuda").manual_seed(seed)

with autocast("cuda"), torch.inference_mode():
    imgs = pipeline(
        prompt,
        negative_prompt=neg_prompt,
        height=height, width=width,
        num_images_per_prompt=num_samples,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        generator=generator,
    ).images

# display() renders inline in a notebook; see below for saving to disk instead.
for img in imgs:
    display(img)
```
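
If you are running outside a notebook or want to keep the outputs, save them to disk instead of calling `display`. On GPUs with limited memory, `pipeline.enable_attention_slicing()` (a standard diffusers API) can lower peak VRAM usage at a small speed cost if enabled before generation:

```python
# Call before generation to reduce peak VRAM usage on smaller GPUs.
pipeline.enable_attention_slicing()

# Save the generated images to disk instead of displaying them inline.
for i, img in enumerate(imgs):
    img.save(f"bayc_sample_{i}.png")
```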

## Further Optimization

Results can be improved further by continued fine-tuning and by adjusting training and inference parameters such as the scheduler, guidance scale, and number of inference steps.
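
As one inference-side example (a sketch using standard diffusers APIs, not a recipe specific to this model), you can swap the scheduler and sweep the guidance scale to explore that space:

```python
from diffusers import DPMSolverMultistepScheduler

# Swap in a faster scheduler; DPM-Solver++ often gives good results in ~25 steps.
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config)

# Sweep the guidance scale to trade prompt adherence against image diversity.
for gs in (5, 7.5, 10):
    img = pipeline(
        "a pirate bayc nft",
        num_inference_steps=25,
        guidance_scale=gs,
    ).images[0]
    img.save(f"bayc_gs{gs}.png")
```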