license: apache-2.0
tags:
  - SDXL
  - Text-to-Image
  - ControlNet
  - Diffusers
  - Stable Diffusion

ControlNet++: All-in-one ControlNet for image generation and editing!

images_display

Network Architecture

images

Advantages of the model

  • Uses bucket training like NovelAI, so it can generate high-resolution images of any aspect ratio.
  • Uses a large amount of high-quality data (over 10,000,000 images); the dataset covers a diverse range of situations.
  • Uses re-captioned prompts like DALL-E 3, with CogVLM generating detailed descriptions, for good prompt-following ability.
  • Uses many useful tricks during training, including but not limited to data augmentation, multiple losses, and multi-resolution training.
  • Uses almost the same number of parameters as the original ControlNet, with no obvious increase in network parameters or computation.
  • Supports 10+ control conditions, with no obvious performance drop on any single condition compared with training each condition independently.
  • Supports multi-condition generation. Condition fusion is learned during training, so there is no need to set hyperparameters or design prompts.
  • Compatible with other open-source SDXL models, such as BluePencilXL and CounterfeitXL, and with other LoRA models.
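
The bucket training mentioned in the first point can be illustrated with a minimal sketch. It is not the author's training code: the bucket list, the function names, and the nearest-aspect-ratio rule below are all assumptions chosen for illustration. The idea is that each training image is assigned to the resolution bucket whose aspect ratio is closest to its own, so that a batch only contains images of one shape.

```python
import math
from collections import defaultdict

# Hypothetical bucket list as (width, height). Each bucket has roughly
# 1024 * 1024 pixels and sides that are multiples of 64. The buckets used
# to train the released model are not stated in this document.
BUCKETS = [
    (1024, 1024),
    (1152, 896), (896, 1152),
    (1216, 832), (832, 1216),
    (1344, 768), (768, 1344),
]


def nearest_bucket(width, height, buckets=BUCKETS):
    """Return the bucket whose aspect ratio is closest to width / height.

    Ratios are compared in log space so that, for example, 2:1 and 1:2
    are treated as equally far from 1:1.
    """
    target = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - target))


def group_by_bucket(sizes, buckets=BUCKETS):
    """Map each bucket to the indices of the images assigned to it."""
    groups = defaultdict(list)
    for index, (width, height) in enumerate(sizes):
        groups[nearest_bucket(width, height, buckets)].append(index)
    return dict(groups)
```

For example, `group_by_bucket([(1920, 1080), (1000, 1000), (3840, 2160)])` places the two 16:9 images in the same bucket and the square image in another, so each group can be resized and batched at a single resolution.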

We design a new architecture that supports 10+ control types in conditional text-to-image generation and can generate high-resolution images visually comparable with Midjourney. The network is based on the original ControlNet architecture, and we propose two new modules to: (1) extend the original ControlNet to support different image conditions using the same network parameters; (2) support multiple condition inputs without increasing the computational load, which is especially important for designers who want to edit images in detail. Different conditions share the same condition encoder, so no extra computation or parameters are added. We run thorough experiments on SDXL and achieve superior performance in both control ability and aesthetic score. We release the method and the model to the open-source community so that everyone can enjoy it.
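
Since one set of parameters serves every control type and several conditions can be supplied together, the network needs some signal telling it which conditions are active. One common way to do this is a multi-hot indicator vector. The sketch below illustrates that idea only: the `CONTROL_TYPES` ordering and the `control_type_vector` helper are hypothetical, and the released model may index or group control types differently, so check the inference scripts in the repository for the actual interface.

```python
# Hypothetical ordering of the control types listed in this README.
# The released model may use a different ordering or group several
# types under one index.
CONTROL_TYPES = [
    "openpose", "depth", "canny", "lineart", "animelineart", "mlsd",
    "scribble", "hed", "pidi", "teed", "segment", "normal",
]


def control_type_vector(active):
    """Build a multi-hot vector marking which control types are in use.

    `active` is an iterable of names from CONTROL_TYPES. The result has
    one float per control type: 1.0 if that type is active, else 0.0.
    """
    active = set(active)
    unknown = active - set(CONTROL_TYPES)
    if unknown:
        raise ValueError(f"unknown control types: {sorted(unknown)}")
    return [1.0 if name in active else 0.0 for name in CONTROL_TYPES]
```

A single-condition request such as `control_type_vector(["depth"])` sets one entry, while `control_type_vector(["openpose", "canny"])` sets two. The vector length stays fixed in both cases, which is consistent with the claim that adding conditions does not add parameters.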

Inference scripts and more details can be found at: https://github.com/xinsir6/ControlNetPlus/tree/main

If you find it useful, please give me a star. Thank you very much!

Visual Examples

Openpose

pose0 pose1 pose2 pose3 pose4

Depth

depth0 depth1 depth2 depth3 depth4

Canny

canny0 canny1 canny2 canny3 canny4

Lineart

lineart0 lineart1 lineart2 lineart3 lineart4

AnimeLineart

animelineart0 animelineart1 animelineart2 animelineart3 animelineart4

Mlsd

mlsd0 mlsd1 mlsd2 mlsd3 mlsd4

Scribble

scribble0 scribble1 scribble2 scribble3 scribble4

Hed

hed0 hed1 hed2 hed3 hed4

Pidi (Softedge)

pidi0 pidi1 pidi2 pidi3 pidi4

Teed

ted0 ted1 ted2 ted3 ted4

Segment

segment0 segment1 segment2 segment3 segment4

Normal

normal0 normal1 normal2 normal3 normal4

Multi Control Visual Examples

Openpose + Canny

pose_canny0 pose_canny1 pose_canny2 pose_canny3 pose_canny4 pose_canny5

Openpose + Depth

pose_depth0 pose_depth1 pose_depth2 pose_depth3 pose_depth4 pose_depth5

Openpose + Scribble

pose_scribble0 pose_scribble1 pose_scribble2 pose_scribble3 pose_scribble4 pose_scribble5

Openpose + Normal

pose_normal0 pose_normal1 pose_normal2 pose_normal3 pose_normal4 pose_normal5

Openpose + Segment

pose_segment0 pose_segment1 pose_segment2 pose_segment3 pose_segment4 pose_segment5