FredZhang7/anime-anything-promptgen-v2

	---
	license: creativeml-openrail-m
	language:
	- en
	widget:
	- text: 1girl, fate
	- text: 1boy, league of
	- text: 1girl, genshin
	- text: 1boy, national basketball association
	- text: 1girl, spy x
	- text: 1girl, absurdres
	tags:
	- stable-diffusion
	- anime
	- anything-v4
	- art
	- arxiv:2210.14140
	datasets:
	- FredZhang7/anime-prompts-180K
	---

	## Fast Anime PromptGen

	This model was trained for 3 epochs on a dataset of 80,000 safe anime prompts. I fetched the prompts from the [Safebooru API endpoint](https://safebooru.donmai.us/posts/random.json), keeping only unique prompts with up_score ≥ 8 and no [blacklisted tags](./blacklist.txt).
	I didn't release the V1 model because it often generated gibberish prompts. After trying everything I could to correct that behavior, I found that the gibberish came not from the pipeline parameters, model structure, or training duration, but from the random usernames in the training data.
	Here's the complete [prompt preprocessing algorithm](./preprocess.py).
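
The filtering rules described above can be sketched roughly as follows. This is only an illustration, not the code in `preprocess.py`: the function name `filter_prompts`, the sample posts, and the post fields `tag_string` and `tag_string_artist` are assumptions, and treating artist tags as the "random usernames" is a guess made for the sake of the example.

```python
def filter_prompts(posts, blacklist, min_up_score=8):
    """Return unique prompts from posts that pass the score and blacklist checks.

    `posts` is a list of dicts shaped like booru API results (assumed fields:
    up_score, tag_string, tag_string_artist). Hypothetical helper.
    """
    seen = set()
    prompts = []
    for post in posts:
        # Rule 1: keep only posts with enough upvotes.
        if post.get("up_score", 0) < min_up_score:
            continue
        tags = post.get("tag_string", "").split()
        # Rule 2: reject the whole post if any tag is blacklisted.
        if any(tag in blacklist for tag in tags):
            continue
        # Assumption: usernames show up as artist tags, so drop those.
        artists = set(post.get("tag_string_artist", "").split())
        kept = [tag.replace("_", " ") for tag in tags if tag not in artists]
        prompt = ", ".join(kept)
        # Rule 3: keep each prompt only once.
        if prompt and prompt not in seen:
            seen.add(prompt)
            prompts.append(prompt)
    return prompts


sample_posts = [
    {"up_score": 12, "tag_string": "1girl long_hair smile some_artist",
     "tag_string_artist": "some_artist"},
    {"up_score": 3, "tag_string": "1boy hat", "tag_string_artist": ""},
    {"up_score": 9, "tag_string": "1girl long_hair smile", "tag_string_artist": ""},
    {"up_score": 10, "tag_string": "1girl banned_tag", "tag_string_artist": ""},
]
print(filter_prompts(sample_posts, blacklist={"banned_tag"}))
```

Only the first sample post survives: the second fails the score threshold, the third duplicates the first once the artist tag is removed, and the fourth contains a blacklisted tag.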


	## Text-to-image Examples

	Prefix 1girl \| [Generated 1girl prompts](./anime_girl_settings.txt) \| Model Anything V4

	![](./anime_girls.png)

	Prefix 1boy \| [Generated 1boy prompts](./anime_boy_settings.txt) \| Model Anything V4

	![](./anime_boys.png)

	## Contrastive Search
	```
	pip install --upgrade transformers
	```
	```python
	import torch
	from transformers import GPT2Tokenizer, GPT2LMHeadModel, pipeline
	tokenizer = GPT2Tokenizer.from_pretrained('distilgpt2')
	tokenizer.add_special_tokens({'pad_token': '[PAD]'})
	model = GPT2LMHeadModel.from_pretrained('FredZhang7/anime-anything-promptgen-v2')

	prompt = r'1girl, genshin'

	# generate text using fine-tuned model
	nlp = pipeline('text-generation', model=model, tokenizer=tokenizer)

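	# generate 10 samples; with do_sample=True and top_k=4 this call performs top-k sampling
	# (contrastive search would instead need penalty_alpha > 0 with do_sample=False)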
	outs = nlp(prompt, max_length=76, num_return_sequences=10, do_sample=True, repetition_penalty=1.2, temperature=0.7, top_k=4, early_stopping=True)

	print('\nInput:\n' + 100 * '-')
	print('\033[96m' + prompt + '\033[0m')
	print('\nOutput:\n' + 100 * '-')
	for i in range(len(outs)):
	    # collapse double spaces and remove trailing commas
	    outs[i] = str(outs[i]['generated_text']).replace('  ', ' ').strip().rstrip(',')
	print('\033[92m' + '\n\n'.join(outs) + '\033[0m\n')
	```

	Output Example:

	![](./contrastive_search.png)

	Please see [Fast GPT PromptGen](https://huggingface.co/FredZhang7/distilgpt2-stable-diffusion-v2) for more info on the pipeline parameters.


	## Awesome Tips

	- If a generated anime character doesn't show much emotion, try adding emoticons like `;o`, `:o`, `;p`, `:d`, `:p`, and `;d` to the prompt.
	  I also use `happy smirk`, `happy smile`, `laughing closed eyes`, etc. to make the characters more lively and expressive.

	- Adding `absurdres`, instead of `highres` and `masterpiece`, to a prompt can drastically increase the sharpness and resolution of a generated image.
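
As a small convenience, the tips above can be applied to a generated prompt with a helper like the one below. This is a minimal sketch: `add_tags` is a hypothetical name and not part of this repository, and the example prompt is made up.

```python
def add_tags(prompt, extra_tags):
    """Append extra tags to a comma-separated prompt, skipping tags already present."""
    # Split on commas and drop empty pieces (e.g. from a trailing comma).
    tags = [tag.strip() for tag in prompt.split(",") if tag.strip()]
    present = {tag.lower() for tag in tags}
    for tag in extra_tags:
        if tag.lower() not in present:
            tags.append(tag)
            present.add(tag.lower())
    return ", ".join(tags)


print(add_tags("1girl, genshin impact, smile,", ["absurdres", ":d", "smile"]))
# 1girl, genshin impact, smile, absurdres, :d
```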

	## Danbooru
	[Link to the Danbooru version](https://huggingface.co/FredZhang7/danbooru-tag-generator)