lllyasviel/control_v11p_sd15_depth


patrickvonplaten committed on Apr 16, 2023

Commit 1008d5d • 1 Parent(s): b052f13

Update README.md

README.md +1 -146


@@ -1,146 +1 @@
----
-license: openrail
-base_model: runwayml/stable-diffusion-v1-5
-tags:
-- art
-- controlnet
-- stable-diffusion
-duplicated_from: ControlNet-1-1-preview/control_v11p_sd15_depth
----
-# ControlNet - v1.1 - *depth Version*
-**ControlNet v1.1** is the successor model of [ControlNet v1.0](https://huggingface.co/lllyasviel/ControlNet)
-and was released in [lllyasviel/ControlNet-v1-1](https://huggingface.co/lllyasviel/ControlNet-v1-1) by [Lvmin Zhang](https://huggingface.co/lllyasviel).
-This checkpoint is a conversion of [the original checkpoint](https://huggingface.co/lllyasviel/ControlNet-v1-1/blob/main/control_v11p_sd15_depth.pth) into `diffusers` format.
-It can be used in combination with **Stable Diffusion**, such as [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5).
-For more details, please also have a look at the [🧨 Diffusers docs](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/controlnet).
-ControlNet is a neural network structure to control diffusion models by adding extra conditions.
-![img](./sd.png)
-This checkpoint corresponds to the ControlNet conditioned on **depth images**.
-## Model Details
-- **Developed by:** Lvmin Zhang, Maneesh Agrawala
-- **Model type:** Diffusion-based text-to-image generation model
-- **Language(s):** English
-- **License:** [The CreativeML OpenRAIL M license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) is an [Open RAIL M license](https://www.licenses.ai/blog/2022/8/18/naming-convention-of-responsible-ai-licenses), adapted from the work that [BigScience](https://bigscience.huggingface.co/) and [the RAIL Initiative](https://www.licenses.ai/) are jointly carrying out in the area of responsible AI licensing. See also [the article about the BLOOM Open RAIL license](https://bigscience.huggingface.co/blog/the-bigscience-rail-license) on which our license is based.
-- **Resources for more information:** [GitHub Repository](https://github.com/lllyasviel/ControlNet), [Paper](https://arxiv.org/abs/2302.05543).
-- **Cite as:**
-  @misc{zhang2023adding,
-    title={Adding Conditional Control to Text-to-Image Diffusion Models},
-    author={Lvmin Zhang and Maneesh Agrawala},
-    year={2023},
-    eprint={2302.05543},
-    archivePrefix={arXiv},
-    primaryClass={cs.CV}
-  }
-## Introduction
-ControlNet was proposed in [*Adding Conditional Control to Text-to-Image Diffusion Models*](https://arxiv.org/abs/2302.05543) by
-Lvmin Zhang, Maneesh Agrawala.
-The abstract reads as follows:
-*We present a neural network structure, ControlNet, to control pretrained large diffusion models to support additional input conditions.
-The ControlNet learns task-specific conditions in an end-to-end way, and the learning is robust even when the training dataset is small (< 50k).
-Moreover, training a ControlNet is as fast as fine-tuning a diffusion model, and the model can be trained on a personal devices.
-Alternatively, if powerful computation clusters are available, the model can scale to large amounts (millions to billions) of data.
-We report that large diffusion models like Stable Diffusion can be augmented with ControlNets to enable conditional inputs like edge maps, segmentation maps, keypoints, etc.
-This may enrich the methods to control large diffusion models and further facilitate related applications.*
-## Example
-It is recommended to use the checkpoint with [Stable Diffusion v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) as the checkpoint
-has been trained on it.
-Experimentally, the checkpoint can also be used with other diffusion models, such as DreamBooth fine-tuned versions of Stable Diffusion.
-**Note**: If you want to process an image to create the auxiliary conditioning, external dependencies are required as shown below:
-1. Let's install `diffusers` and related packages:
-```
-$ pip install diffusers transformers accelerate
-```
-2. Run code:
-```python
-import os
-import numpy as np
-import torch
-from PIL import Image
-from transformers import pipeline
-from diffusers import (
-    ControlNetModel,
-    StableDiffusionControlNetPipeline,
-    UniPCMultistepScheduler,
-)
-from diffusers.utils import load_image
-checkpoint = "lllyasviel/control_v11p_sd15_depth"
-image = load_image(
-    "https://huggingface.co/lllyasviel/control_v11p_sd15_depth/resolve/main/images/input.png"
-)
-prompt = "Stormtrooper's lecture in beautiful lecture hall"
-# Estimate depth, then stack the single-channel map into a 3-channel control image.
-depth_estimator = pipeline("depth-estimation")
-image = depth_estimator(image)["depth"]
-image = np.array(image)
-image = image[:, :, None]
-image = np.concatenate([image, image, image], axis=2)
-control_image = Image.fromarray(image)
-# Create the output directory first so that the save calls below do not fail.
-os.makedirs("images", exist_ok=True)
-control_image.save("images/control.png")
-# Load the ControlNet and attach it to the Stable Diffusion v1-5 pipeline in half precision.
-controlnet = ControlNetModel.from_pretrained(checkpoint, torch_dtype=torch.float16)
-pipe = StableDiffusionControlNetPipeline.from_pretrained(
-    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
-)
-pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
-# Offload idle submodules to the CPU to reduce GPU memory use (requires `accelerate`).
-pipe.enable_model_cpu_offload()
-# Fix the seed for reproducible output.
-generator = torch.manual_seed(0)
-image = pipe(prompt, num_inference_steps=30, generator=generator, image=control_image).images[0]
-image.save("images/image_out.png")
-```
-![input](./images/input.png)
-![depth_control](./images/control.png)
-![image_out](./images/image_out.png)
-## Other released checkpoints v1-1
-The authors released 14 different checkpoints, each trained with [Stable Diffusion v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5)
-on a different type of conditioning:
-| Model Name | Control Image Overview| Control Image Example | Generated Image Example |
-|---|---|---|---|
-TODO
-### Training
-TODO
-### Blog post
-For more information, please also have a look at the [Diffusers ControlNet Blog Post](https://huggingface.co/blog/controlnet).


+This model has been deleted as it was incorrectly uploaded. The corrected model can be found under [this link](https://huggingface.co/lllyasviel/control_v11f1p_sd15_depth)