madebyollin patrickvonplaten committed on
Commit
97ea5f1
1 Parent(s): 9292e14

Add example for Diffusers (#4)

- Add example for Diffusers (8e464fcbfa507e688c84216d26597c4aabf10a1e)
- fix (e9769b0ff7c0c08e4603dd1ef9423756853eaeee)
- finish example (83866b7ebabad73845b577d4c527c6b457c21820)
- final fix (3cbfc6fdea6c4bc5c5851b6e25c3e38ca193fb1d)


Co-authored-by: Patrick von Platen <[email protected]>

Files changed (1)
  1. README.md +27 -1
README.md CHANGED
@@ -14,6 +14,32 @@ SDXL-VAE-FP16-Fix is the [SDXL VAE](https://huggingface.co/stabilityai/sdxl-vae)
  | SDXL-VAE | ✅ ![](./images/orig-fp32.png) | ⚠️ ![](./images/orig-fp16.png) |
  | SDXL-VAE-FP16-Fix | ✅ ![](./images/fix-fp32.png) | ✅ ![](./images/fix-fp16.png) |

+ ## 🧨 Diffusers Usage
+
+ Just load this checkpoint via `AutoencoderKL`:
+
+ ```py
+ import torch
+ from diffusers import DiffusionPipeline, AutoencoderKL
+
+ vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
+ pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-0.9", vae=vae, torch_dtype=torch.float16, variant="fp16", use_safetensors=True)
+ pipe.to("cuda")
+
+ refiner = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-refiner-0.9", vae=vae, torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
+ refiner.to("cuda")
+
+ n_steps = 40
+ high_noise_frac = 0.7
+
+ prompt = "A majestic lion jumping from a big stone at night"
+
+ image = pipe(prompt=prompt, num_inference_steps=n_steps, denoising_end=high_noise_frac, output_type="latent").images
+ image = refiner(prompt=prompt, num_inference_steps=n_steps, denoising_start=high_noise_frac, image=image).images[0]
+ image
+ ```
+
+ ![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/lion_refined.png)

  ## Details

@@ -25,4 +51,4 @@ SDXL-VAE-FP16-Fix was created by finetuning the SDXL-VAE to:
  2. make the internal activation values smaller, by
  3. scaling down weights and biases within the network

- There are slight discrepancies between the output of SDXL-VAE-FP16-Fix and SDXL-VAE, but the decoded images should be close enough for most purposes.
+ There are slight discrepancies between the output of SDXL-VAE-FP16-Fix and SDXL-VAE, but the decoded images should be close enough for most purposes.
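
The "Details" list in the README says the fix works by making internal activations smaller through scaled-down weights and biases. The sketch below is only a toy illustration of why that can help in float16, not the actual finetuning procedure: the two-layer network, the weights, and the factor `s` are all hypothetical, and float16 rounding is emulated with Python's `struct` half-precision format rather than a GPU. Dividing the first layer's weight and bias by `s` and multiplying the next layer's weight by `s` leaves the output unchanged in exact arithmetic (ReLU commutes with positive scaling), while keeping the hidden value inside float16's finite range.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest float16 value; overflow becomes +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values beyond the float16 range (max 65504);
        # float16 hardware would produce infinity instead, so emulate that.
        return math.copysign(math.inf, x)


def two_layer(x, w1, b1, w2, b2, cast=lambda v: v):
    """Toy network: linear -> ReLU -> linear, rounding each step with `cast`."""
    hidden = cast(max(0.0, cast(w1 * x + b1)))
    return cast(w2 * hidden + b2)


# Hypothetical toy values, chosen as powers of two so every step is exact.
x, w1, b1, w2, b2 = 128.0, 1024.0, 0.0, 2.0 ** -10, 0.0
s = 16.0  # hypothetical rescaling factor

# Full precision reference: hidden = 131072, output = 128.
reference = two_layer(x, w1, b1, w2, b2)
# Same weights in emulated float16: hidden = 131072 overflows to inf.
naive_fp16 = two_layer(x, w1, b1, w2, b2, cast=to_fp16)
# Rescaled weights in emulated float16: hidden = 8192 fits, output is unchanged.
scaled_fp16 = two_layer(x, w1 / s, b1 / s, w2 * s, b2, cast=to_fp16)

print(reference, naive_fp16, scaled_fp16)  # prints: 128.0 inf 128.0
```

In a real network the rescaling cannot be applied this cleanly everywhere, which is consistent with the README's note that the finetuned outputs differ slightly from the original SDXL-VAE.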