Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

Introduction

Meissonic is a non-autoregressive mask image modeling text-to-image synthesis model that can generate high-resolution images. It is designed to run on consumer graphics cards.

Architecture

Key Features

High-resolution image generation (up to 1024x1024)
Designed to run on consumer GPUs
Versatile applications: text-to-image, image-to-image

License

This project is licensed under the Apache License, Version 2.0 (SPDX-License-Identifier: Apache-2.0).

Disclaimer

We used compliance-checking algorithms during the training process, to ensure the compliance of the trained model to the best of our ability. Due to the complexity of the data and the diversity of language model usage scenarios, we cannot guarantee that the model is completely free of copyright issues or improper content. If you believe anything infringes on your rights or generates improper content, please contact us, and we will promptly address the matter.