---
license: apache-2.0
tags:
- SDXL
- Text-to-Image
- ControlNet
- Diffusers
- Stable Diffusion
---

# **ControlNet++: All-in-one ControlNet for image generation and editing!**
![images_display](./images/masonry.webp)

## Network Architecture
![images](./images/ControlNet++.png)

## Advantages of the model
- Uses bucket training like NovelAI, so it can generate high-resolution images of any aspect ratio.
- Uses a large amount of high-quality data (over 10,000,000 images); the dataset covers a diverse range of situations.
- Uses re-captioned prompts like DALL-E 3: CogVLM generates detailed descriptions, which gives good prompt-following ability.
- Uses many useful tricks during training, including but not limited to data augmentation, multiple losses, and multi-resolution training.
- Uses almost the same number of parameters as the original ControlNet, with no obvious increase in network parameters or computation.
- Supports 10+ control conditions, with no obvious performance drop on any single condition compared with training each condition independently.
- Supports multi-condition generation. Condition fusion is learned during training, so there is no need to set hyperparameters or design prompts.
- Compatible with other open-source SDXL models, such as BluePencilXL and CounterfeitXL, and with other LoRA models.

***We design a new architecture that supports 10+ control types in conditional text-to-image generation and can generate high-resolution images visually comparable with Midjourney***. The network is based on the original ControlNet architecture, and we propose two new modules to:
1. Extend the original ControlNet to support different image conditions using the same network parameters.
2. Support multiple condition inputs without increasing the computational cost, which is especially important for designers who want to edit images in detail. Different conditions use the same condition encoder, without adding extra computation or parameters.

We run thorough experiments on SDXL and achieve superior performance in both control ability and aesthetic score.

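
The bucket training mentioned in the advantages list is not described in detail here. As a rough illustration of the general aspect-ratio bucketing idea only (not the training code used for this model), the sketch below enumerates resolution buckets under a fixed pixel budget and assigns each image to the bucket with the closest aspect ratio. The function names `make_buckets`, `assign_bucket`, and `group_by_bucket`, and all default values (pixel budget, step, side limits), are hypothetical choices made for this example.

```python
import math


def make_buckets(max_area=1024 * 1024, step=64, min_side=512, max_side=2048):
    """Enumerate (width, height) buckets whose area stays within max_area.

    All defaults are illustrative assumptions, not values from this model.
    """
    buckets = set()
    for width in range(min_side, max_side + 1, step):
        # Largest height (a multiple of step) with width * height <= max_area.
        height = (max_area // width) // step * step
        if min_side <= height <= max_side:
            buckets.add((width, height))
            buckets.add((height, width))  # mirrored orientation
    return sorted(buckets)


def assign_bucket(width, height, buckets):
    """Return the bucket whose aspect ratio is closest (in log space)."""
    target = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - target))


def group_by_bucket(sizes, buckets):
    """Map each bucket to the indices of the images assigned to it."""
    groups = {}
    for index, (width, height) in enumerate(sizes):
        groups.setdefault(assign_bucket(width, height, buckets), []).append(index)
    return groups


if __name__ == "__main__":
    buckets = make_buckets()
    sizes = [(2000, 2000), (1920, 1080), (1000, 1000), (800, 1400)]
    for bucket, indices in group_by_bucket(sizes, buckets).items():
        print(bucket, indices)
```

In a training loop built on this idea, each batch would be drawn from a single bucket and its images resized and cropped to that bucket's resolution, so every batch has one tensor shape while the dataset as a whole keeps its variety of aspect ratios.
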
We release the method and the model to the open-source community so that everyone can enjoy them. Inference scripts and more details can be found at: https://github.com/xinsir6/ControlNetPlus/tree/main

**If you find it useful, please give me a star, thank you very much**

## Visual Examples
### Openpose
![pose0](./images/000000_pose_concat.webp)
![pose1](./images/000001_pose_concat.webp)
![pose2](./images/000002_pose_concat.webp)
![pose3](./images/000003_pose_concat.webp)
![pose4](./images/000004_pose_concat.webp)
### Depth
![depth0](./images/000005_depth_concat.webp)
![depth1](./images/000006_depth_concat.webp)
![depth2](./images/000007_depth_concat.webp)
![depth3](./images/000008_depth_concat.webp)
![depth4](./images/000009_depth_concat.webp)
### Canny
![canny0](./images/000010_canny_concat.webp)
![canny1](./images/000011_canny_concat.webp)
![canny2](./images/000012_canny_concat.webp)
![canny3](./images/000013_canny_concat.webp)
![canny4](./images/000014_canny_concat.webp)
### Lineart
![lineart0](./images/000015_lineart_concat.webp)
![lineart1](./images/000016_lineart_concat.webp)
![lineart2](./images/000017_lineart_concat.webp)
![lineart3](./images/000018_lineart_concat.webp)
![lineart4](./images/000019_lineart_concat.webp)
### AnimeLineart
![animelineart0](./images/000020_anime_lineart_concat.webp)
![animelineart1](./images/000021_anime_lineart_concat.webp)
![animelineart2](./images/000022_anime_lineart_concat.webp)
![animelineart3](./images/000023_anime_lineart_concat.webp)
![animelineart4](./images/000024_anime_lineart_concat.webp)
### Mlsd
![mlsd0](./images/000025_mlsd_concat.webp)
![mlsd1](./images/000026_mlsd_concat.webp)
![mlsd2](./images/000027_mlsd_concat.webp)
![mlsd3](./images/000028_mlsd_concat.webp)
![mlsd4](./images/000029_mlsd_concat.webp)
### Scribble
![scribble0](./images/000030_scribble_concat.webp)
![scribble1](./images/000031_scribble_concat.webp)
![scribble2](./images/000032_scribble_concat.webp)
![scribble3](./images/000033_scribble_concat.webp)
![scribble4](./images/000034_scribble_concat.webp)
### Hed
![hed0](./images/000035_hed_concat.webp)
![hed1](./images/000036_hed_concat.webp)
![hed2](./images/000037_hed_concat.webp)
![hed3](./images/000038_hed_concat.webp)
![hed4](./images/000039_hed_concat.webp)
### Pidi(Softedge)
![pidi0](./images/000040_softedge_concat.webp)
![pidi1](./images/000041_softedge_concat.webp)
![pidi2](./images/000042_softedge_concat.webp)
![pidi3](./images/000043_softedge_concat.webp)
![pidi4](./images/000044_softedge_concat.webp)
### Teed
![ted0](./images/000045_ted_concat.webp)
![ted1](./images/000046_ted_concat.webp)
![ted2](./images/000047_ted_concat.webp)
![ted3](./images/000048_ted_concat.webp)
![ted4](./images/000049_ted_concat.webp)
### Segment
![segment0](./images/000050_seg_concat.webp)
![segment1](./images/000051_seg_concat.webp)
![segment2](./images/000052_seg_concat.webp)
![segment3](./images/000053_seg_concat.webp)
![segment4](./images/000054_seg_concat.webp)
### Normal
![normal0](./images/000055_normal_concat.webp)
![normal1](./images/000056_normal_concat.webp)
![normal2](./images/000057_normal_concat.webp)
![normal3](./images/000058_normal_concat.webp)
![normal4](./images/000059_normal_concat.webp)

## Multi Control Visual Examples
### Openpose + Canny
![pose_canny0](./images/000007_openpose_canny_concat.webp)
![pose_canny1](./images/000008_openpose_canny_concat.webp)
![pose_canny2](./images/000009_openpose_canny_concat.webp)
![pose_canny3](./images/000010_openpose_canny_concat.webp)
![pose_canny4](./images/000011_openpose_canny_concat.webp)
![pose_canny5](./images/000012_openpose_canny_concat.webp)
### Openpose + Depth
![pose_depth0](./images/000013_openpose_depth_concat.webp)
![pose_depth1](./images/000014_openpose_depth_concat.webp)
![pose_depth2](./images/000015_openpose_depth_concat.webp)
![pose_depth3](./images/000016_openpose_depth_concat.webp)
![pose_depth4](./images/000017_openpose_depth_concat.webp)
![pose_depth5](./images/000018_openpose_depth_concat.webp)
### Openpose + Scribble
![pose_scribble0](./images/000001_openpose_scribble_concat.webp)
![pose_scribble1](./images/000002_openpose_scribble_concat.webp)
![pose_scribble2](./images/000003_openpose_scribble_concat.webp)
![pose_scribble3](./images/000004_openpose_scribble_concat.webp)
![pose_scribble4](./images/000005_openpose_scribble_concat.webp)
![pose_scribble5](./images/000006_openpose_scribble_concat.webp)
### Openpose + Normal
![pose_normal0](./images/000019_openpose_normal_concat.webp)
![pose_normal1](./images/000020_openpose_normal_concat.webp)
![pose_normal2](./images/000021_openpose_normal_concat.webp)
![pose_normal3](./images/000022_openpose_normal_concat.webp)
![pose_normal4](./images/000023_openpose_normal_concat.webp)
![pose_normal5](./images/000024_openpose_normal_concat.webp)
### Openpose + Segment
![pose_segment0](./images/000025_openpose_sam_concat.webp)
![pose_segment1](./images/000026_openpose_sam_concat.webp)
![pose_segment2](./images/000027_openpose_sam_concat.webp)
![pose_segment3](./images/000028_openpose_sam_concat.webp)
![pose_segment4](./images/000029_openpose_sam_concat.webp)
![pose_segment5](./images/000030_openpose_sam_concat.webp)