|
--- |
|
title: boring_e621 |
|
tags: |
|
- textual inversion embeddings |
|
- image-generation |
|
license: apache-2.0 |
|
--- |
|
|
|
# boring_e621 |
|
|
|
This embedding attempts to capture what it means for an image to be uninteresting. It was trained as a negative embedding, with e621-style tags used as the training prompts.
|
If you're using the [Automatic1111 Stable Diffusion WebUI](https://github.com/AUTOMATIC1111/stable-diffusion-webui), place the .pt file in `stable-diffusion-webui\embeddings` and add "by boring_e621" to your negative prompt for more interesting outputs.
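
The two setup steps above can also be scripted. The following is a minimal sketch, not part of the WebUI itself: the helper names `install_embedding` and `add_to_negative_prompt` are hypothetical, and the code assumes the embedding file is named `boring_e621.pt` and that you pass in the path to your own WebUI checkout.

```python
import shutil
from pathlib import Path

# Assumption: the embedding file is called boring_e621.pt, so the phrase
# used in the negative prompt is "by boring_e621" as described above.
EMBEDDING_NAME = "boring_e621"


def install_embedding(pt_file: Path, webui_root: Path) -> Path:
    """Copy the .pt file into <webui_root>/embeddings and return the new path."""
    target_dir = webui_root / "embeddings"
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(pt_file, target_dir / pt_file.name))


def add_to_negative_prompt(negative_prompt: str = "") -> str:
    """Append "by boring_e621" to a negative prompt unless it is already present."""
    phrase = f"by {EMBEDDING_NAME}"
    if phrase in negative_prompt:
        return negative_prompt
    parts = [p.strip() for p in negative_prompt.split(",") if p.strip()]
    return ", ".join(parts + [phrase])
```

For example, `add_to_negative_prompt("blurry, lowres")` gives a string you can paste into the negative prompt box.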
|
<br> |
|
|
|
## Model Description |
|
|
|
|
|
The motivation for boring_e621 is that negative embeddings like [Bad Prompt](https://huggingface.co/datasets/Nerfgun3/bad_prompt), whose training is described [here](https://www.reddit.com/r/StableDiffusion/comments/yy2i5a/i_created_a_negative_embedding_textual_inversion/), depend on manually curated lists of tags describing features people do not want their images to have, such as "deformed hands". Some problems with this approach are:
|
* Manually compiled lists will inevitably be incomplete. |
|
* Models might not always understand the tags well due to a dearth of training images labeled with these tags. |
|
* It can only capture named concepts. Unnamed yet visually unappealing qualities that make an image look wrong, for reasons that cannot be succinctly explained, will not be captured by a list of tags.
|
<br> |
|
|
|
To address these problems, boring_e621 employs textual inversion on a set of images automatically extracted from the art site e621.net, a rich resource of millions of artworks, each of which is hand-labeled by topic and rated according to its quality. E621.net allows users to express their approval of an artwork by either up-voting it or marking it as a favorite. boring_e621 was trained specifically on artworks automatically selected from the site according to the criterion that no user has ever favorited or up-voted them. The embedding thus learned to produce low-quality images, so when it is used in the negative prompt of a Stable Diffusion image generator, the model steers away from the features that would make the generation more boring.
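
The selection criterion described above can be sketched as a simple filter. This is an illustration only, not the script actually used to build the training set: the function names are hypothetical, and the field names `fav_count` and `score["up"]` are assumptions about how post metadata might be represented.

```python
from typing import Iterable, List, Mapping


def is_unappreciated(post: Mapping) -> bool:
    """True when no user has favorited or up-voted the post.

    Assumed (hypothetical) metadata layout:
      post["fav_count"]   -> number of favorites
      post["score"]["up"] -> number of up-votes
    A post with missing fields raises KeyError rather than being
    silently treated as unappreciated.
    """
    return post["fav_count"] == 0 and post["score"]["up"] == 0


def select_boring_posts(posts: Iterable[Mapping]) -> List[Mapping]:
    """Keep only the posts that satisfy the zero-favorites, zero-up-votes criterion."""
    return [post for post in posts if is_unappreciated(post)]
```

Note that down-votes are ignored here: the criterion in the text only requires the absence of favorites and up-votes.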
|
<br> |
|
|
|
## Bias, Risks, and Limitations
|
* Using this as a negative embedding often sacrifices some fidelity to the prompt. For example, characters in the image may disappear or change eye/skin color. |
|
* Using this as a negative embedding may introduce unexpected or undesired content into the image to make it look less boring. |
|
* Unlike other negative embeddings, this is not intended to fix problems like extra limbs or deformed hands. It can be used alongside other negative embeddings to fix deformities. |
|
<br> |
|
|
|
## Evaluation
|
|
|
To qualitatively evaluate how well boring_e621 has learned to improve image quality, we apply it to four simple sample prompts using the base Stable Diffusion 1.5 model.
|
|
|
![boring_e621 and boring_e621_v4 Performance on Simple Prompts](tmpoqs1d_vv.png) |
|
|
|
As we can see, putting these embeddings in the negative prompt yields a more delicious burger, a more vibrant and detailed landscape, a prettier pharaoh, and a more 3D-looking aquarium.
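
A side-by-side comparison of this kind can be set up as in the sketch below. Several things here are assumptions rather than the settings used for the figure: the placeholder prompts only echo the four subjects mentioned above, the fixed seed is a choice made so that each pair differs only in the negative prompt, and `generate` is a hypothetical callback standing in for whatever image generator you use.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

NEGATIVE = "by boring_e621"

# Placeholder prompts (assumption): the exact prompts behind the figure are not given.
PLACEHOLDER_PROMPTS = ["a burger", "a landscape", "a pharaoh", "an aquarium"]


@dataclass(frozen=True)
class Job:
    prompt: str
    negative_prompt: str
    seed: int


def ab_jobs(prompts: Sequence[str], seed: int = 0) -> List[Tuple[Job, Job]]:
    """Pair each prompt with a baseline job and a job using the negative embedding."""
    return [(Job(p, "", seed), Job(p, NEGATIVE, seed)) for p in prompts]


def run_comparison(
    prompts: Sequence[str],
    generate: Callable[[Job], object],  # hypothetical: wraps your image generator
    seed: int = 0,
) -> List[Tuple[object, object]]:
    """Return (baseline, with_embedding) outputs for every prompt."""
    return [(generate(a), generate(b)) for a, b in ab_jobs(prompts, seed)]
```

Keeping the seed identical within each pair means any visible difference can be attributed to the negative prompt rather than to sampling noise.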
|
|
|
|
|
## Other Models |
|
|
|
Boring_e621 has been reported to work well with SD 1.4 or 1.5 models such as: |
|
* https://civitai.com/models/18208?modelVersionId=68551 |
|
* https://civitai.com/models/12979/lawlass-yiffymix-20-furry-model |
|
* https://civitai.com/models/4698/lawlass-yiff-mix |
|
* https://civitai.com/models/15503/kavka-mix |
|
* https://civitai.com/models/17649/bb95-furry-mix |
|
* https://huggingface.co/Doubleyobro/yiffy-e18 (the fine-tuned model used to train boring_e621)
|
|
|
|