Text-guided image inpainting
The StableDiffusionInpaintPipeline allows you to edit specific parts of an image by providing a mask and a text prompt. It uses a version of Stable Diffusion, such as runwayml/stable-diffusion-inpainting, that has been specifically trained for inpainting tasks.
Get started by loading an instance of the StableDiffusionInpaintPipeline:
import requests
import torch
from io import BytesIO
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline
pipeline = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
    use_safetensors=True,
)
pipeline = pipeline.to("cuda")
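If the full pipeline doesn't fit comfortably in GPU memory, Diffusers offers a couple of opt-in optimizations you can enable on the pipeline. A minimal sketch; note that with enable_model_cpu_offload you would skip the pipeline.to("cuda") call above, since offloading manages device placement itself:

# Reduce peak VRAM by computing attention in sequential slices
pipeline.enable_attention_slicing()

# Or offload each submodel to the CPU when it is idle (requires the
# accelerate package); use this instead of pipeline.to("cuda")
pipeline.enable_model_cpu_offload()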
Download an image and a mask of the dog that you'll eventually replace:
def download_image(url):
    response = requests.get(url)
    return Image.open(BytesIO(response.content)).convert("RGB")
img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"
init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))
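The mask follows a simple convention: white pixels mark the region the pipeline will repaint, and black pixels mark the region it will preserve. If you want to build your own mask instead of downloading one, a minimal sketch with Pillow's ImageDraw might look like this (the rectangle coordinates are hypothetical placeholders for the region you want to replace):

from PIL import Image, ImageDraw

# Start from an all-black mask (nothing is repainted),
# then paint the region to replace in white
mask = Image.new("L", init_image.size, 0)
draw = ImageDraw.Draw(mask)
draw.rectangle((100, 150, 400, 450), fill=255)  # hypothetical region to repaint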
Now you can create a prompt to replace the masked area with something else:
prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
image = pipeline(prompt=prompt, image=init_image, mask_image=mask_image).images[0]
[Table: image | mask_image | prompt | output — the original dog photo, its mask, the prompt "Face of a yellow cat, high resolution, sitting on a park bench", and the resulting inpainted image, shown side by side]
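As with the other Stable Diffusion pipelines, you can pass additional parameters to the call to trade off speed, quality, and prompt adherence, and seed a torch.Generator for reproducible results. A sketch using the pipeline's standard parameters (the negative prompt, seed, and filename are arbitrary examples):

generator = torch.Generator("cuda").manual_seed(0)  # fix the seed for reproducible output

image = pipeline(
    prompt=prompt,
    negative_prompt="low resolution, deformed",  # hypothetical negative prompt
    image=init_image,
    mask_image=mask_image,
    num_inference_steps=50,  # the default; more steps trade speed for quality
    guidance_scale=7.5,      # the default; higher values follow the prompt more closely
    generator=generator,
).images[0]
image.save("yellow_cat.png")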
A previous experimental implementation of inpainting used a different, lower-quality process. For backwards compatibility, loading a pretrained pipeline that doesn't contain a dedicated inpainting model will still apply the old inpainting method.
Check out the inpainting Spaces on the Hugging Face Hub to try out image inpainting yourself!