---
library_name: diffusers
base_model:
- stabilityai/stable-diffusion-2-1-base
---





# 🍰 Hybrid-sd-small-vae for Stable Diffusion

[Hybrid-sd-small-vae](https://huggingface.co./cqyan/hybrid-sd-small-vae) is a pruned and finetuned VAE that uses the same "latent API" as the original [SD-VAE](https://huggingface.co./stabilityai/stable-diffusion-2-1-base), so it works as a drop-in replacement.
It is smaller and faster than the SD1.5 VAE while matching it on image saturation and clarity. Specifically, we reduce the parameter count from 83.65M to 62.01M, cut decoder inference time from 186.58 ms to 135.58 ms, and lower decoder GPU memory usage from 12987 MiB to 9087 MiB (roughly a 30% saving), without losing T2I generation quality.
The model is useful for real-time previewing of the SD1.x generation process, and you are very welcome to try it!
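
Because the latent layout is unchanged, a latent encoded by the original SD-VAE can be decoded directly by the pruned VAE. The snippet below is a minimal roundtrip sketch of this, assuming the custom `AutoencoderKL` class shipped in our repository (see "Using in 🧨 diffusers" below) follows the standard diffusers `encode`/`decode` interface:

```python
import torch
from diffusers import AutoencoderKL as BaseAutoencoderKL

# Custom class from the Hybrid-SD repository (see the usage section below).
from bytenn_autoencoder_kl import AutoencoderKL

# Original SD-VAE encoder and the pruned Hybrid VAE, both in fp16 on GPU.
base_vae = BaseAutoencoderKL.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base", subfolder="vae", torch_dtype=torch.float16
).to("cuda")
small_vae = AutoencoderKL.from_pretrained(
    "cqyan/hybrid-sd-small-vae", torch_dtype=torch.float16
).to("cuda")

with torch.no_grad():
    # Dummy image in [-1, 1]; replace with a real image tensor.
    image = torch.rand(1, 3, 512, 512, dtype=torch.float16, device="cuda") * 2 - 1
    latent = base_vae.encode(image).latent_dist.sample()   # encode with the original VAE
    decoded = small_vae.decode(latent).sample              # decode with the pruned VAE
```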

**Index Table**

| Model | Params (M) | Decoder inference time (ms) | Decoder GPU memory usage (MiB) |
|--------|-------|-------|-------|
| SD1.5 VAE | 83.65 | 186.58 | 12987 |
| **Hybrid-sd-small-vae** | **62.014 ↓** | **135.58 ↓** | **9087 ↓** |




T2I comparison using one A100 GPU. Image order from left to right: [SD-VAE](https://huggingface.co./stabilityai/stable-diffusion-2-1-base) -> [Hybrid-sd-small-vae](https://huggingface.co./cqyan/hybrid-sd-small-vae)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/664afcc45fdb7108205a15c3/u8UNo7apM5eY7yCXxkjiK.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/664afcc45fdb7108205a15c3/kYbOUEFyN63CFPvmy6Gea.png)

This repo contains `.safetensors` versions of the Hybrid-sd-small-vae weights.
For SDXL, use [Hybrid-sd-small-vae-xl](https://huggingface.co./cqyan/hybrid-sd-small-vae-xl) instead (the SD and SDXL VAEs are incompatible).





## Using in 🧨 diffusers

First, clone our repository so that the custom `AutoencoderKL` implementation can be imported:
```bash
git clone https://github.com/bytedance/Hybrid-SD.git
```
Run the example below from a location where the repository's `bytenn_autoencoder_kl` module is on your Python path.

```python
import torch
from diffusers import DiffusionPipeline

# Custom AutoencoderKL implementation from the Hybrid-SD repository cloned above.
from bytenn_autoencoder_kl import AutoencoderKL

# Load the base Stable Diffusion pipeline in fp16.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base", torch_dtype=torch.float16
)

# Swap the original SD-VAE for the pruned Hybrid-sd-small-vae (drop-in replacement).
vae = AutoencoderKL.from_pretrained("cqyan/hybrid-sd-small-vae", torch_dtype=torch.float16)
pipe.vae = vae
pipe = pipe.to("cuda")

prompt = "A warm and loving family portrait, highly detailed, hyper-realistic, 8k resolution, photorealistic, soft and natural lighting"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("family.png")
```
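
Since the pruned decoder is fast, it is also handy for previewing intermediate latents while the pipeline is sampling, as mentioned above. Below is a minimal sketch, assuming a recent diffusers version that supports `callback_on_step_end` and that the small VAE keeps the standard `scaling_factor` in its config; the preview interval and file names are arbitrary choices:

```python
import torch

@torch.no_grad()
def preview_callback(pipeline, step, timestep, callback_kwargs):
    # Decode the current latents with the small VAE every 5 denoising steps.
    if step % 5 == 0:
        latents = callback_kwargs["latents"]
        # Assumes the small VAE config keeps the standard scaling_factor.
        images = pipeline.vae.decode(latents / pipeline.vae.config.scaling_factor).sample
        pipeline.image_processor.postprocess(images)[0].save(f"preview_{step:03d}.png")
    return callback_kwargs

image = pipe(
    prompt,
    num_inference_steps=25,
    callback_on_step_end=preview_callback,
    callback_on_step_end_tensor_inputs=["latents"],
).images[0]
```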