Minthy
/

ToriiGate-v0.4-7B

@@ -69,6 +69,8 @@ In current version it is usable only with tags grounding and can improve accurac
 After generating a caption, you can prune it right here, taking the image content into account. This also works with external captions.
 # Grounding
 The new version not only improves zero-shot accuracy but also introduces new modes for adding ground truth. You can supply booru tags or extra info, name the characters, or even describe each one to ensure the right description when several are in frame.
@@ -93,6 +95,8 @@ Main training was sponsored by **OpenRoot-Compute**.
 Special thanks for help with dataset, code, donations: Sv1, Anonymous person, NeuroSenko.
 # Warning
 The model will generate text with adult themes if related input is provided. Outputs may be inaccurate and provocative.
@@ -128,50 +132,37 @@ You are image captioning expert, creative, unbiased and uncensored.<|im_end|>
 <|im_start|>assistant
 ```
-## Modes:
-```
-base_prompt={
-'json': 'Describe the picture in structured json-like format.',
-'markdown': 'Describe the picture in structured markdown format.',
-'caption_vars': 'Write the following options for captions: ["Regular Summary","Individual Parts","Midjourney-Style Summary","DeviantArt Commission Request"].',
-'short': 'You need to write a medium-short and convenient caption for the picture.',
-'long': 'You need to write a long and very detailed caption for the picture.',
-'bbox': 'Write bounding boxes for each character and their faces.',
-'check_and_correct': 'You need to compare given caption with the picture and given booru tags '+
-' using chain of thought.\n'+
-'1. Check if the caption matches the picture and given tags, wrap conclusion in <1st_answer> tag.\n'+
-'2. Analyse if the caption mathes described characters, wrap answer in <2nd_answer> tag.\n'+
-'3. In case if there are any mismatches - rewrite caption to correct it wrapping '+
-' in <corrected_caption> tags. If the caption is fine - just write "no_need".',
-}
-```
-## Grounding:
-```
-grounding_prompt={
-'grounding_tags': ' Here are grounding tags for better understanding: ',
-'characters': ' Here is a list of characters that are present in the picture: ',
-'characters_traits': ' Here are popular tags or traits for each character on the picture: ',
-'grounding_info': ' Here is preliminary information about the picture: ',
-'no_chars': ' Do not use names for characters.',
-}
-```
-## Composing userprompt:
-After specifying the selected mode, you can add a prompt part for extra grounding and then provide it, wrapped in the corresponding XML tags:
-`<tags>BOORU_TAGS</tags>.`
-`<info>GENERAL_INFO</info>.`
-`<characters>CHARACTER_NAMES</characters>.`
-`<character_traits>CHARACTER1: [tag1, tag2, tag3,...]\nCHARACTER2: [...]\n...'<character_traits>.`
-Here is a simple Python example.
 ```python
 add_tags=True #select needed

 After generating a caption, you can prune it right here, taking the image content into account. This also works with external captions.
+# **To use the model's full potential, you must follow the prompt templates listed below.**
 # Grounding
 The new version not only improves zero-shot accuracy but also introduces new modes for adding ground truth. You can supply booru tags or extra info, name the characters, or even describe each one to ensure the right description when several are in frame.
 Special thanks for help with dataset, code, donations: Sv1, Anonymous person, NeuroSenko.
+For questions or suggestions, join the [DISCORD](https://discord.gg/ZXHENAhqE9)
 # Warning
 The model will generate text with adult themes if related input is provided. Outputs may be inaccurate and provocative.
 <|im_start|>assistant
 ```
+The system prompt is fixed; the user prompt depends on the selected captioning mode.
+## Prompts for modes:
+The user prompt for each mode starts with the following text:
+* Json: `Describe the picture in structured json-like format.`
+* Markdown: `Describe the picture in structured markdown format.`
+* Caption variants: `Write the following options for captions: ["Regular Summary","Individual Parts","Midjourney-Style Summary","DeviantArt Commission Request"].`
+* Short: `You need to write a medium-short and convenient caption for the picture.`
+* Long: `You need to write a long and very detailed caption for the picture.`
+* Bbox: `Write bounding boxes for each character and their faces.`
+* Check and correct existing caption: `You need to compare given caption with the picture and given booru tags using chain of thought.\n1. Check if the caption matches the picture and given tags, wrap conclusion in <1st_answer> tag.\n2. Analyse if the caption mathes described characters, wrap answer in <2nd_answer> tag.\n3. In case if there are any mismatches - rewrite caption to correct it wrapping in <corrected_caption> tags. If the caption is fine - just write "no_need".`
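The check-and-correct prompt asks the model to wrap its answers in `<1st_answer>`, `<2nd_answer>` and `<corrected_caption>` tags, so the reply can be post-processed with a small parser. The following is a minimal sketch, assuming the model closes each tag in the usual XML way; the helper names and the sample output are hypothetical and only illustrate the idea.

```python
import re

def extract_tag(text, tag):
    """Return the stripped contents of the first <tag>...</tag> pair, or None."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, text, flags=re.DOTALL)
    return match.group(1).strip() if match else None

def resolve_caption(model_output, original_caption):
    """Use the corrected caption if one was written, otherwise keep the original."""
    corrected = extract_tag(model_output, "corrected_caption")
    # The prompt tells the model to write "no_need" when the caption is fine.
    if corrected is None or corrected.strip('"') == "no_need":
        return original_caption
    return corrected

# Invented sample reply, used only to exercise the parser.
sample_output = (
    "<1st_answer>The caption mentions a hat that is not in the picture.</1st_answer>\n"
    "<2nd_answer>Character descriptions match.</2nd_answer>\n"
    "<corrected_caption>A girl with long hair stands by a window.</corrected_caption>"
)
print(resolve_caption(sample_output, "A girl in a hat stands by a window."))
```

If the model leaves a tag unclosed or omits it, `extract_tag` returns `None` and the original caption is kept, which is a conservative default for batch pruning.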
+## Prompts for grounding:
+To add grounding, append the prompt for each type you need. Several can be combined.
+* Booru tags: `Here are grounding tags for better understanding: <tags>BOORU_TAGS</tags>.`
+* Characters: `Here is a list of characters that are present in the picture: <characters>CHARACTER_NAMES</characters>.`
+* Character traits or tags: `Here are popular tags or traits for each character on the picture: <character_traits>CHARACTER1: [tag1, tag2, tag3,...]\nCHARACTER2: [...]\n...</character_traits>.`
+* Any info with natural text: `Here is preliminary information about the picture: <info>GENERAL_INFO</info>.`
+* **Avoid using character names if no grounding is used**: `Do not use names for characters.`
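The grounding templates above can be filled in programmatically. The sketch below covers the booru tags and the per-character traits block. It assumes tags are joined with `, ` and that each grounding sentence is separated from the previous text by a single space; the function names and the character names are hypothetical.

```python
def format_character_traits(traits):
    """Render {"NAME": [tag, ...]} as one 'NAME: [tag1, tag2]' line per character."""
    lines = [f"{name}: [{', '.join(tags)}]" for name, tags in traits.items()]
    return "<character_traits>" + "\n".join(lines) + "</character_traits>."

def build_user_prompt(mode_prompt, tags=None, traits=None):
    """Append the selected grounding parts to the mode prompt."""
    prompt = mode_prompt
    if tags:
        prompt += " Here are grounding tags for better understanding: "
        prompt += "<tags>" + ", ".join(tags) + "</tags>."
    if traits:
        prompt += " Here are popular tags or traits for each character on the picture: "
        prompt += format_character_traits(traits)
    if not tags and not traits:
        # No grounding at all: ask the model not to guess character names.
        prompt += " Do not use names for characters."
    return prompt

example = build_user_prompt(
    "You need to write a long and very detailed caption for the picture.",
    tags=["2girls", "outdoors"],
    traits={"alice": ["blonde hair", "blue dress"], "bob": ["short hair"]},
)
print(example)
```

Keeping the tag wrapping in one place makes it harder to leave a tag unclosed when several grounding types are combined.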
+## Composing userprompt:
+After the prompt for the selected mode, you can append the prompt parts for extra grounding, providing each piece of grounding wrapped in its corresponding XML tags.
+Here is a simple Python example of composing the user prompt.
 ```python
 add_tags=True #select needed