---
license: creativeml-openrail-m
language:
- en
widget:
- text: 1girl, fate
- text: 1boy, league of legends
- text: 1girl, genshin impact
- text: 1boy, national basketball association
- text: 1girl, spy x
- text: 1girl, absurdres
tags:
- stable-diffusion
- anime
- anything-v4
- art
- arxiv:2210.14140
datasets:
- FredZhang7/anime-prompts-180K
---

## Fast Anime PromptGen

The main model (`pytorch_model.bin`) was trained for 3 epochs on a dataset of **80,000** anime tag lists. I fetched the tags from the [Safebooru API endpoint](https://safebooru.donmai.us/posts/random.json), keeping only posts with an **up_score ≥ 8** and no [blacklisted tags](./blacklist.txt). 
I didn't release the V1 model because it only generated gibberish prompts. After trying everything I could to correct that behavior, I eventually found that the gibberish was caused not by the model or the training duration, but by random usernames left in the training data. 
Here's the complete [prompt preprocessing algorithm](./preprocess.py).
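
For context, here is a minimal sketch of the filtering step. The field names `up_score` and `tag_string` and the one-tag-per-line blacklist format are assumptions about the Danbooru-style API and repo files; the actual logic lives in `preprocess.py` and may differ.

```python
import requests

# one blacklisted tag per line (assumed format of blacklist.txt)
BLACKLIST = set(open('blacklist.txt').read().split())

def fetch_filtered_tags(n=1000):
    """Collect tag lists from random Safebooru posts, keeping only well-rated, non-blacklisted ones."""
    accepted = []
    while len(accepted) < n:
        post = requests.get('https://safebooru.donmai.us/posts/random.json', timeout=10).json()
        tags = post.get('tag_string', '').split()
        # keep posts with up_score >= 8 and no blacklisted tags
        if post.get('up_score', 0) >= 8 and not BLACKLIST.intersection(tags):
            accepted.append(', '.join(tags))
    return accepted
```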


Todo:
- upload Danbooru model

## Text-to-image Examples

Prefix *1girl* | [Generated *1girl* prompts](./anime_girl_settings.txt) | Model *Anything V4*

![](./anime_girls.png)

Prefix *1boy*  | [Generated *1boy* prompts](./anime_boy_settings.txt) | Model *Anything V4*

![](./anime_boys.png)
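
If you want to render one of the generated prompts yourself, a minimal `diffusers` sketch is below. The checkpoint path is a placeholder for whichever Anything V4 weights you use, and the prompt is an illustrative example, not taken from the files above.

```python
import torch
from diffusers import StableDiffusionPipeline

# placeholder: point this at your local or Hub copy of Anything V4
pipe = StableDiffusionPipeline.from_pretrained('path/to/anything-v4.0', torch_dtype=torch.float16).to('cuda')

prompt = '1girl, absurdres, long hair, smile, looking at viewer'  # illustrative prompt
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save('anime_girl.png')
```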

## Contrastive Search
```
pip install --upgrade transformers
```
```python
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel, pipeline
tokenizer = GPT2Tokenizer.from_pretrained('distilgpt2')
tokenizer.add_special_tokens({'pad_token': '[PAD]'})
model = GPT2LMHeadModel.from_pretrained('FredZhang7/anime-anything-promptgen')

prompt = r'1girl, genshin'

# build a text-generation pipeline with the fine-tuned model
nlp = pipeline('text-generation', model=model, tokenizer=tokenizer)

# generate 10 samples with top-k sampling
outs = nlp(prompt, max_length=76, num_return_sequences=10, do_sample=True, repetition_penalty=1.1, temperature=0.7, top_k=4, early_stopping=True)

print('\nInput:\n' + 100 * '-')
print('\033[96m' + prompt + '\033[0m')
print('\nOutput:\n' + 100 * '-')
for i in range(len(outs)):
    # collapse double spaces and strip trailing commas
    outs[i] = str(outs[i]['generated_text']).replace('  ', ' ').rstrip(',')
print('\033[92m' + '\n\n'.join(outs) + '\033[0m\n')
```
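
Note that the pipeline call above uses top-k sampling. If you specifically want Transformers' contrastive search decoding, a rough sketch (parameter values are only examples) is to call `model.generate` with `penalty_alpha` and `top_k`, reusing the `model`, `tokenizer`, and `prompt` loaded above:

```python
# contrastive search is activated by penalty_alpha > 0 together with top_k > 1
input_ids = tokenizer(prompt, return_tensors='pt').input_ids
output_ids = model.generate(input_ids, penalty_alpha=0.6, top_k=4, max_length=76,
                            pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```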

Output Example:

![](./contrastive_search.png)

Please see [Fast GPT PromptGen](https://huggingface.co./FredZhang7/distilgpt2-stable-diffusion-v2) for more info on the pipeline parameters.


## Tips

- If a generated anime character looks emotionless, try adding emoticons such as `;o`, `:o`, `;p`, `:d`, `:p`, and `;d` to the prompt.
I often use `happy smirk`, `happy smile`, `laughing closed eyes`, etc. to make the characters more lively and expressive.

- Adding `absurdres` to a prompt, instead of `highres` and `masterpiece`, tends to increase the sharpness of the generated image.