---
license: creativeml-openrail-m
language:
- en
widget:
- text: 1girl, fate
- text: 1boy, league of legends
- text: 1girl, genshin impact
- text: 1boy, national basketball association
- text: 1girl, spy x
- text: 1girl, absurdres
tags:
- stable-diffusion
- anime
- anything-v4
- art
- arxiv:2210.14140
datasets:
- FredZhang7/anime-prompts-180K
---

## Fast Anime PromptGen

The main model (`pytorch_model.bin`) was trained for 3 epochs on a dataset of **80,000** anime tags. I fetched the tags from the [Safebooru API endpoint](https://safebooru.donmai.us/posts/random.json), keeping only posts with **up_score ≥ 8** and without any [blacklisted tags](./blacklist.txt).
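The acceptance rule described above (a minimum `up_score` and no blacklisted tags) can be sketched roughly as follows. This is a minimal illustration, not the script used for training: the helper name `is_accepted`, the field names `up_score` and `tag_string`, and the sample posts are all assumptions made for the example, and no network request is performed.

```python
from typing import Iterable

MIN_UP_SCORE = 8  # threshold stated in the text above


def is_accepted(post: dict, blacklist: Iterable[str], min_up_score: int = MIN_UP_SCORE) -> bool:
    """Return True if a post passes both the score check and the blacklist check."""
    # Assumed field names: "up_score" (int) and "tag_string" (space-separated tags).
    if post.get("up_score", 0) < min_up_score:
        return False
    tags = set(post.get("tag_string", "").split())
    # Reject the post if it shares at least one tag with the blacklist.
    return tags.isdisjoint(blacklist)


# Hypothetical sample posts, not real API responses.
posts = [
    {"up_score": 12, "tag_string": "1girl solo smile"},
    {"up_score": 3, "tag_string": "1girl solo"},
    {"up_score": 20, "tag_string": "1boy banned_tag"},
]
blacklist = {"banned_tag"}  # stand-in for the contents of blacklist.txt

accepted = [p for p in posts if is_accepted(p, blacklist)]
print(len(accepted))
```

Only the first sample post survives here: the second falls below the score threshold and the third carries a blacklisted tag.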

I didn't release the V1 model because it only generated gibberish prompts. After trying everything I could to correct that behavior, I eventually figured out that the gibberish came not from the model or the training duration, but from the random usernames in the training data.

Here's the complete [prompt preprocessing algorithm](./preprocess.py).
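The linked script is the reference for the actual preprocessing. Purely to illustrate the idea of dropping username-like tags before training, a minimal sketch might look like the following. It assumes each post exposes a space-separated `tag_string` and a `tag_string_artist` field listing artist names; those field names and the helper `strip_usernames` are assumptions for this example and are not taken from `preprocess.py`.

```python
def strip_usernames(tag_string: str, artist_string: str) -> str:
    """Drop artist/username tags and return a comma-separated prompt."""
    # Assumption: usernames are exactly the tags listed in the artist field.
    usernames = set(artist_string.split())
    kept = [tag for tag in tag_string.split() if tag not in usernames]
    # Booru tags use underscores; the prompts in this card use spaces and commas.
    return ", ".join(tag.replace("_", " ") for tag in kept)


# Hypothetical input, not a real post.
example = strip_usernames("1girl genshin_impact some_artist42 smile", "some_artist42")
print(example)  # 1girl, genshin impact, smile
```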

Todo:

- upload Danbooru model

## Text-to-image Examples

Prefix *1girl* | [Generated *1girl* prompts](./anime_girl_settings.txt) | Model *Anything V4*

![](./anime_girls.png)

Prefix *1boy* | [Generated *1boy* prompts](./anime_boy_settings.txt) | Model *Anything V4*

![](./anime_boys.png)

## Contrastive Search

```
pip install --upgrade transformers
```

```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel, pipeline

# load the distilgpt2 tokenizer and register a padding token
tokenizer = GPT2Tokenizer.from_pretrained('distilgpt2')
tokenizer.add_special_tokens({'pad_token': '[PAD]'})
model = GPT2LMHeadModel.from_pretrained('FredZhang7/anime-anything-promptgen')

prompt = r'1girl, genshin'

# generate text using fine-tuned model
nlp = pipeline('text-generation', model=model, tokenizer=tokenizer)

# generate 10 samples using top-k sampling (do_sample=True, top_k=4)
outs = nlp(prompt, max_length=76, num_return_sequences=10, do_sample=True, repetition_penalty=1.1, temperature=0.7, top_k=4, early_stopping=True)

print('\nInput:\n' + 100 * '-')
print('\033[96m' + prompt + '\033[0m')
print('\nOutput:\n' + 100 * '-')
for i in range(len(outs)):
    # remove trailing commas and double spaces
    outs[i] = str(outs[i]['generated_text']).replace('  ', ' ').rstrip(',')
print('\033[92m' + '\n\n'.join(outs) + '\033[0m\n')
```

Output Example:

![](./contrastive_search.png)

Please see [Fast GPT PromptGen](https://huggingface.co/FredZhang7/distilgpt2-stable-diffusion-v2) for more info on the pipeline parameters.
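The clean-up step at the end of the snippet in this section can also be factored into a small helper that works on plain strings, which makes it easy to check without loading the model. The sketch below is one possible version, not part of the released code: the names `clean_prompt` and `unique_prompts` are hypothetical, and the de-duplication step is an addition that the original snippet does not perform.

```python
import re


def clean_prompt(text: str) -> str:
    """Collapse runs of spaces and strip trailing commas and spaces."""
    text = re.sub(r" {2,}", " ", text)
    return text.rstrip(", ")


def unique_prompts(raw_outputs):
    """Clean each generated string and drop empty or repeated prompts, keeping order."""
    seen, result = set(), []
    for raw in raw_outputs:
        cleaned = clean_prompt(raw)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


# Hypothetical generated strings, not real model output.
samples = [
    "1girl, genshin impact,  smile,",
    "1girl, genshin impact, smile",
    "1girl, solo, ",
]
print(unique_prompts(samples))
```

With the pipeline output from the snippet in this section, the helper could be applied as `unique_prompts(o['generated_text'] for o in outs)`.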

## Tips

- If a generated anime character looks expressionless, try adding emoticons such as `;o`, `:o`, `;p`, `:d`, `:p`, and `;d` to the prompt.
  I often use `happy smirk`, `happy smile`, `laughing closed eyes`, etc. to make the characters more lively and expressive.

- Adding `absurdres`, instead of `highres` and `masterpiece`, to a prompt tends to increase the sharpness of a generated image.
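
As a small illustration of the tips above, the sketch below appends an emoticon and `absurdres` to a comma-separated prompt before it is passed to a text-to-image model. The helper name `with_tags` is hypothetical, and the choice of tags is only an example.

```python
def with_tags(prefix: str, *extra_tags: str) -> str:
    """Append tags to a comma-separated prompt, skipping tags already present."""
    tags = [t.strip() for t in prefix.split(",") if t.strip()]
    for tag in extra_tags:
        if tag not in tags:
            tags.append(tag)
    return ", ".join(tags)


prompt = with_tags("1girl, genshin impact", ":d", "absurdres")
print(prompt)  # 1girl, genshin impact, :d, absurdres
```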