update

Files changed (3)

README.md +73 -0
config.json +16 -0
diffusion_pytorch_model.safetensors +3 -0

README.md ADDED

	@@ -0,0 +1,73 @@

+# SD3 Controlnet softedge
+This softedge ControlNet is fine-tuned from SD3-medium. It was trained on 12M images from open-source and internal e-commerce datasets, and it performs well on both general and e-commerce image generation. It supports the pidinet and hed preprocessors, as well as their safe modes.
+## Examples
+From left to right: pidinet preprocessor, ours with pidinet, hed preprocessor, ours with hed.
+`pidinet`|`controlnet`|`hed`|`controlnet`
+:--:|:--:|:--:|:--:
+![image](./images/im1_1.webp) | ![image](./images/im1_2.webp) | ![image](./images/im1_3.webp) | ![image](./images/im1_4.webp)
+![image](./images/im2_1.webp) | ![image](./images/im2_2.webp) | ![image](./images/im2_3.webp) | ![image](./images/im2_4.webp)
+![image](./images/im3_1.webp) | ![image](./images/im3_2.webp) | ![image](./images/im3_3.webp) | ![image](./images/im3_4.webp)
+![image](./images/im4_1.webp) | ![image](./images/im4_2.webp) | ![image](./images/im4_3.webp) | ![image](./images/im4_4.webp)
+![image](./images/im5_1.webp) | ![image](./images/im5_2.webp) | ![image](./images/im5_3.webp) | ![image](./images/im5_4.webp)
+## Usage with Diffusers
+```python
+import torch
+from controlnet_aux import PidiNetDetector
+from diffusers import StableDiffusion3ControlNetPipeline
+from diffusers.models import SD3ControlNetModel
+from diffusers.utils import load_image
+
+# Load the softedge ControlNet weights in half precision.
+controlnet = SD3ControlNetModel.from_pretrained(
+    "alimama-creative/SD3-Controlnet-Softedge", torch_dtype=torch.float16
+)
+pipe = StableDiffusion3ControlNetPipeline.from_pretrained(
+    "stabilityai/stable-diffusion-3-medium-diffusers",
+    controlnet=controlnet,
+    variant="fp16",
+    torch_dtype=torch.float16,
+)
+pipe.text_encoder.to(torch.float16)
+pipe.controlnet.to(torch.float16)
+pipe.to("cuda")
+image = load_image(
+    "https://huggingface.co/alimama-creative/SD3-Controlnet-Softedge/resolve/main/images/im1_0.png"
+)
+prompt = "A dog sitting on a park bench."
+width = 1024
+height = 1024
+# Extract a soft-edge map from the reference image at the generation resolution.
+edge_processor = PidiNetDetector.from_pretrained("lllyasviel/Annotators")
+edge_image = edge_processor(image, detect_resolution=width, image_resolution=width)
+res_image = pipe(
+    prompt=prompt,
+    negative_prompt="deformed, distorted, disfigured, poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, mutated hands and fingers, disconnected limbs, mutation, mutated, ugly, disgusting, blurry, amputation, NSFW",
+    height=height,
+    width=width,
+    control_image=edge_image,
+    num_inference_steps=25,
+    controlnet_conditioning_scale=0.95,
+    guidance_scale=5,
+).images[0]
+res_image.save("sd3.png")
+```
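The README states that the model also accepts hed edge maps and the safe mode of either preprocessor, while the snippet above only exercises pidinet. A minimal sketch of selecting the detector and safe mode follows, assuming `controlnet_aux` exposes `HEDdetector` and `PidiNetDetector` and that both accept a `safe` keyword when called. The helper `make_edge_map` and the `DETECTORS` table are hypothetical names, not part of the repository.

```python
# Hypothetical helper, not part of the model repository.
DETECTORS = {"pidinet": "PidiNetDetector", "hed": "HEDdetector"}


def make_edge_map(image, kind="hed", safe=True, resolution=1024):
    """Return a soft-edge control image for `image` (a PIL image)."""
    if kind not in DETECTORS:
        raise ValueError(
            f"unknown preprocessor {kind!r}; expected one of {sorted(DETECTORS)}"
        )
    # Imported lazily so this module loads even without controlnet_aux installed.
    import controlnet_aux

    detector_cls = getattr(controlnet_aux, DETECTORS[kind])
    detector = detector_cls.from_pretrained("lllyasviel/Annotators")
    # Assumption: safe=True switches the detector to its "safe" mode.
    return detector(
        image,
        detect_resolution=resolution,
        image_resolution=resolution,
        safe=safe,
    )
```

The returned image could then be passed as `control_image` in place of `edge_image` in the pipeline call above.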
+## Training Details
+The model was trained for 20k steps at 1024x1024 resolution on 12M images drawn from laion2B and internal sources, filtered to an aesthetic score of 6 or higher. We explored ControlNets with 6, 12, and 23 layers; the 12-layer model gave a good balance between performance and model size, so that is the one we release.
+Mixed precision : FP16<br/>
+Learning rate : 1e-4<br/>
+Batch size : 256<br/>
+Timestep sampling mode : 'logit_normal'<br/>
+Loss : Flow Matching<br/>
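The timestep sampling mode and loss listed above can be illustrated in a few lines. This is a sketch of logit-normal timestep sampling with a flow-matching (velocity) target, not the authors' training code. The logit-normal mean of 0 and standard deviation of 1, the interpolation `x_t = (1 - t) * x0 + t * noise`, and all function names are assumptions.

```python
import math
import random


def sample_timestep(rng, mean=0.0, std=1.0):
    """Logit-normal sample: the sigmoid of a Gaussian draw."""
    u = rng.gauss(mean, std)
    return 1.0 / (1.0 + math.exp(-u))


def flow_matching_pair(x0, noise, t):
    """Return the noised input x_t and the velocity target for one sample."""
    x_t = [(1.0 - t) * a + t * n for a, n in zip(x0, noise)]
    target = [n - a for a, n in zip(x0, noise)]
    return x_t, target


def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


rng = random.Random(0)
x0 = [rng.gauss(0.0, 1.0) for _ in range(8)]     # stand-in for a clean latent
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]  # Gaussian noise sample
t = sample_timestep(rng)
x_t, target = flow_matching_pair(x0, noise, t)
loss = mse(target, target)  # a perfect velocity prediction gives zero loss
```

Logit-normal sampling concentrates timesteps around the middle of the noise schedule, where the velocity is hardest to predict, instead of spreading them uniformly.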
+## LICENSE
+The model is fine-tuned from SD3, so it is released under the original SD3 license.

config.json ADDED

	@@ -0,0 +1,16 @@

+{
+  "_class_name": "SD3ControlNetModel",
+  "_diffusers_version": "0.30.0",
+  "_name_or_path": "./model_hub_tmp_0/.",
+  "attention_head_dim": 64,
+  "caption_projection_dim": 1536,
+  "in_channels": 16,
+  "joint_attention_dim": 4096,
+  "num_attention_heads": 24,
+  "num_layers": 12,
+  "out_channels": 16,
+  "patch_size": 2,
+  "pooled_projection_dim": 2048,
+  "pos_embed_max_size": 192,
+  "sample_size": 128
+}
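
A few derived quantities make this config easier to read. The sketch below computes them from a subset of the values above. The 8x VAE downsampling factor is an assumption about the SD3 autoencoder and is not stated in this config.

```python
import json

config = json.loads("""
{
  "attention_head_dim": 64,
  "caption_projection_dim": 1536,
  "num_attention_heads": 24,
  "num_layers": 12,
  "patch_size": 2,
  "pos_embed_max_size": 192,
  "sample_size": 128
}
""")

VAE_SCALE_FACTOR = 8  # assumption: the SD3 VAE downsamples images 8x per side

# Transformer hidden width: number of heads times per-head dimension.
inner_dim = config["num_attention_heads"] * config["attention_head_dim"]

# Default image side in pixels implied by the latent sample size.
image_size = config["sample_size"] * VAE_SCALE_FACTOR

# Latent patches per side and total image tokens at the default resolution.
patches_per_side = config["sample_size"] // config["patch_size"]
num_tokens = patches_per_side ** 2

print(inner_dim, image_size, patches_per_side, num_tokens)
```

Under that assumption, the hidden width of 1536 equals `caption_projection_dim`, the implied image side of 1024 matches the training resolution given in the README, and the 64 patches per side fit within `pos_embed_max_size`.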

diffusion_pytorch_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4c22ff0d92562c4504dd9545ef0c0cb805d3a786f2c0e821662a5c0b82a4e255
+size 2238999304
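
The three lines above are a Git LFS pointer, not the weights themselves. A minimal sketch of checking a downloaded file against the pointer follows, using only the standard library. The function names and any local path passed to them are hypothetical.

```python
import hashlib
import os

POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4c22ff0d92562c4504dd9545ef0c0cb805d3a786f2c0e821662a5c0b82a4e255
size 2238999304
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and hash."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    h = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Hash in chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
```

After downloading the weights, a call such as `matches_pointer("diffusion_pytorch_model.safetensors", pointer)` would confirm that the local copy is intact.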