Open-LiteVAE
This repository contains a LiteVAE model trained with the open-litevae codebase, based on the paper "LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models" (NeurIPS 2024).
Note: This model is intended for demonstration purposes; we do not recommend using it in production.
License: AGPL-3.0
Parameter | Value |
---|---|
Downscale Factor | 8x |
Latent Z dim | 12 |
Encoder Size (params) | B (6.2M) |
Decoder Size (params) | M (54M) |
Discriminator | UNetGAN-L |
Training Set | ImageNet-1k |
Training Resolution | 128x128 --> 256x256 |
Training Steps | 100k --> 50k |
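For intuition, the downscale factor and latent channel count above fix the latent tensor geometry. A minimal sketch (assuming square RGB inputs at the 256x256 training resolution):

```python
# Latent geometry implied by the config table (assumption: square RGB inputs).
f, c = 8, 12              # downscale factor, latent channels
H = W = 256               # second-stage training resolution
latent_shape = (c, H // f, W // f)
print(latent_shape)       # (12, 32, 32)
print((3 * H * W) / (c * (H // f) * (W // f)))  # 16.0, i.e. 16x fewer elements than the pixel tensor
```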
Model | Z dim | rFID | LPIPS | PSNR | SSIM |
---|---|---|---|---|---|
SD1-VAE | 4 | 0.75 | 0.138 | 25.70 | 0.72 |
SD3-VAE | 16 | 0.22 | 0.069 | 29.59 | 0.86 |
olvf8c12 (this repo) | 12 | 0.24 | 0.084 | 28.74 | 0.84 |
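All four quality columns are reconstruction metrics computed between input images and their encode/decode round trips (lower is better for rFID and LPIPS, higher for PSNR and SSIM). As an illustration of the simplest of these, here is a minimal PSNR helper for image tensors in [0, 1]; this is a sketch, not the script used to produce the table:

```python
import torch

def psnr(x, y, max_val=1.0):
    # Peak signal-to-noise ratio in dB between image tensors with values in [0, max_val].
    mse = torch.mean((x - y) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)
```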
# Requires the open-litevae codebase: https://github.com/RGenDiff/open-litevae
from PIL import Image
import torch
import torchvision.transforms as transforms
from torchvision.utils import save_image
from omegaconf import OmegaConf
from safetensors.torch import load_file
from olvae.utils import instantiate_from_config

def load_model_from_config(config_path, ckpt_path, device=torch.device("cuda")):
    config = OmegaConf.load(config_path)
    sd = load_file(ckpt_path)
    model = instantiate_from_config(config.model)
    model.load_state_dict(sd, strict=False)
    model = model.to(device).eval()
    return model

# load the model
device = torch.device("cuda")
olitevae = load_model_from_config(config_path="configs/olitevaeB_im_f8c12.yaml",
                                  ckpt_path="olitevaeB_im_f8c12.safetensors",
                                  device=device)

# map inputs to [-1, 1], matching the training normalization
img_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

# encode
image = img_transforms(Image.open(<your image>).convert("RGB")).to(device)
with torch.no_grad():
    latent = olitevae.encode(image.unsqueeze(0)).sample()
print(latent.shape)

# decode
with torch.no_grad():
    y = olitevae.decode(latent)
save_image(y[0] * 0.5 + 0.5, "decoded_image.png")  # map back from [-1, 1] to [0, 1]
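Building on the example above, a rough single-image round-trip check can be compared loosely against the PSNR column in the metrics table. This is a sketch: it reuses `olitevae`, `image`, and the encode()/decode() calls shown above, plus the illustrative psnr() helper defined earlier in this card, and a single image will not match the table's averages exactly.

```python
# Round-trip reconstruction check (sketch; assumes `olitevae`, `image`, and psnr() from above).
x = image.unsqueeze(0)                       # normalized input in [-1, 1]
with torch.no_grad():
    recon = olitevae.decode(olitevae.encode(x).sample())
x01 = (x * 0.5 + 0.5).clamp(0, 1)            # back to [0, 1] for metric computation
recon01 = (recon * 0.5 + 0.5).clamp(0, 1)
print(f"round-trip PSNR: {psnr(x01, recon01).item():.2f} dB")
```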
@inproceedings{sadat2024litevae,
title={Lite{VAE}: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models},
author={Seyedmorteza Sadat and Jakob Buhmann and Derek Bradley and Otmar Hilliges and Romann M. Weber},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
url={https://openreview.net/forum?id=mTAbl8kUzq}
}