---
license: mit
library_name: diffusers
---

# Ostris VAE - KL-f8-d16

A 16 channel VAE with 8x downsample. Trained from scratch on a balanced mix of photos, artwork, text, cartoons, and vector images.

It is lighter weight than most VAEs, with only 57,266,643 parameters (vs. the SD3 VAE's 83,819,683), which means it is faster and uses less VRAM, yet it scores quite similarly on real images. Plus, it is MIT licensed, so you can do whatever you want with it.

| VAE | PSNR (higher is better) | LPIPS (lower is better) | # params |
|----|----|----|----|
| sd-vae-ft-mse | 26.939 | 0.0581 | 83,653,863 |
| SDXL | 27.370 | 0.0540 | 83,653,863 |
| SD3 | 31.681 | 0.0187 | 83,819,683 |
| **Ostris KL-f8-d16** | **31.166** | **0.0198** | **57,266,643** |

### Compare

Check out the comparison at [imgsli](https://imgsli.com/Mjc2MjA3).

### What do I do with this?

If you don't know, you probably don't need this. It is meant as an open source, lighter-weight 16 channel VAE. You would need to train it into a network before it is useful. I plan to do this myself for SD 1.5, SDXL, and possibly PixArt. [Follow me on Twitter](https://x.com/ostrisai) to keep up with my work on that.

### Note: Not SD3 compatible

This VAE is not SD3 compatible: it was trained from scratch and has an entirely different latent space.
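
To make the "16 channel, 8x downsample" description and the parameter counts above concrete, here is a minimal sketch in plain Python. It only does arithmetic on numbers stated in this card; it does not load the model. The helper names `latent_shape` and `param_ratio` are hypothetical (they are not part of diffusers), and the requirement that image sides be multiples of 8 is a simplifying assumption of the sketch, not a documented constraint of the VAE.

```python
# Illustrative arithmetic only. The helper names below are hypothetical
# and are not part of diffusers or of this model's code.

DOWNSAMPLE = 8        # the card's "8x downsample"
LATENT_CHANNELS = 16  # the card's "16 channel"

# Parameter counts copied from the comparison table in this card.
PARAMS = {
    "sd-vae-ft-mse": 83_653_863,
    "SDXL": 83_653_863,
    "SD3": 83_819_683,
    "Ostris KL-f8-d16": 57_266_643,
}


def latent_shape(height: int, width: int) -> tuple[int, int, int]:
    """Return the (channels, height, width) of the latent for one image.

    Assumption of this sketch: both sides are multiples of 8, so the
    division is exact.
    """
    if height % DOWNSAMPLE or width % DOWNSAMPLE:
        raise ValueError("height and width must be multiples of 8 in this sketch")
    return (LATENT_CHANNELS, height // DOWNSAMPLE, width // DOWNSAMPLE)


def param_ratio(name: str, baseline: str = "SD3") -> float:
    """Return the parameter count of `name` as a fraction of `baseline`."""
    return PARAMS[name] / PARAMS[baseline]


if __name__ == "__main__":
    print(latent_shape(1024, 1024))                   # (16, 128, 128)
    print(f"{param_ratio('Ostris KL-f8-d16'):.3f}")   # 0.683
```

In other words, a 1024x1024 image maps to a 16x128x128 latent, and this VAE has roughly 68% of the SD3 VAE's parameters. A matching latent shape says nothing about compatibility: as noted above, the latent space itself is different.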