Update README.md
README.md
CHANGED
@@ -69,6 +69,8 @@ In current version it is usable only with tags grounding and can improve accurac
|
|
69 |
|
70 |
After generating a caption, you can prune it right here, taking the image content into account. This also works with external captions.
|
71 |
|
72 |
# Grounding
|
73 |
|
74 |
The new version not only improves zero-shot accuracy but also introduces new modes for adding ground truth. You can use booru tags, add extra info, give names for characters, or even describe each one to ensure the right description when several are in frame.
|
@@ -93,6 +95,8 @@ Main training was sponsored by **OpenRoot-Compute**.
|
|
93 |
|
94 |
Special thanks for help with dataset, code, donations: Sv1, Anonymous person, NeuroSenko.
|
95 |
|
96 |
# Warning
|
97 |
|
98 |
The model will generate text with adult themes if related input is provided. Outputs may be inaccurate and provocative.
|
@@ -128,50 +132,37 @@ You are image captioning expert, creative, unbiased and uncensored.<|im_end|>
|
|
128 |
<|im_start|>assistant
|
129 |
```
|
130 |
|
131 |
-
|
132 |
|
133 |
-
|
134 |
-
base_prompt={
|
135 |
-
'json': 'Describe the picture in structured json-like format.',
|
136 |
-
'markdown': 'Describe the picture in structured markdown format.',
|
137 |
-
'caption_vars': 'Write the following options for captions: ["Regular Summary","Individual Parts","Midjourney-Style Summary","DeviantArt Commission Request"].',
|
138 |
-
'short': 'You need to write a medium-short and convenient caption for the picture.',
|
139 |
-
'long': 'You need to write a long and very detailed caption for the picture.',
|
140 |
-
'bbox': 'Write bounding boxes for each character and their faces.',
|
141 |
-
'check_and_correct': 'You need to compare given caption with the picture and given booru tags '+
|
142 |
-
' using chain of thought.\n'+
|
143 |
-
'1. Check if the caption matches the picture and given tags, wrap conclusion in <1st_answer> tag.\n'+
|
144 |
-
'2. Analyse if the caption mathes described characters, wrap answer in <2nd_answer> tag.\n'+
|
145 |
-
'3. In case if there are any mismatches - rewrite caption to correct it wrapping '+
|
146 |
-
' in <corrected_caption> tags. If the caption is fine - just write "no_need".',
|
147 |
-
}
|
148 |
-
```
|
149 |
|
150 |
-
|
151 |
|
152 |
-
|
153 |
-
|
154 |
-
|
155 |
-
|
156 |
-
|
157 |
-
|
158 |
-
|
159 |
-
}
|
160 |
-
```
|
161 |
|
162 |
-
## Composing userprompt:
|
163 |
|
164 |
-
|
165 |
|
166 |
-
|
167 |
|
168 |
-
|
169 |
|
170 |
-
|
171 |
|
172 |
-
|
173 |
|
174 |
-
Here is a simple Python example.
|
175 |
|
176 |
```python
|
177 |
add_tags=True #select needed
|
|
|
69 |
|
70 |
After generating a caption, you can prune it right here, taking the image content into account. This also works with external captions.
|
71 |
|
72 |
+
# **To utilize the model's full potential, you must follow the prompt templates listed below.**
|
73 |
+
|
74 |
# Grounding
|
75 |
|
76 |
The new version not only improves zero-shot accuracy but also introduces new modes for adding ground truth. You can use booru tags, add extra info, give names for characters, or even describe each one to ensure the right description when several are in frame.
|
|
|
95 |
|
96 |
Special thanks for help with dataset, code, donations: Sv1, Anonymous person, NeuroSenko.
|
97 |
|
98 |
+
Any questions or suggestions - [DISCORD](https://discord.gg/ZXHENAhqE9)
|
99 |
+
|
100 |
# Warning
|
101 |
|
102 |
The model will generate text with adult themes if related input is provided. Outputs may be inaccurate and provocative.
|
|
|
132 |
<|im_start|>assistant
|
133 |
```
|
134 |
|
135 |
+
The system prompt is fixed; the user prompt depends on the captioning mode.
|
136 |
|
137 |
+
## Prompts for modes:
|
138 |
|
139 |
+
Start of userprompt for each mode:
|
140 |
|
141 |
+
* Json: `Describe the picture in structured json-like format.`
|
142 |
+
* Markdown: `Describe the picture in structured markdown format.`
|
143 |
+
* Caption variants: `Write the following options for captions: ["Regular Summary","Individual Parts","Midjourney-Style Summary","DeviantArt Commission Request"].`
|
144 |
+
* Short: `You need to write a medium-short and convenient caption for the picture.`
|
145 |
+
* Long: `You need to write a long and very detailed caption for the picture.`
|
146 |
+
* Bbox: `Write bounding boxes for each character and their faces.`
|
147 |
+
* Check and correct existing caption: `You need to compare given caption with the picture and given booru tags using chain of thought.\n1. Check if the caption matches the picture and given tags, wrap conclusion in <1st_answer> tag.\n2. Analyse if the caption mathes described characters, wrap answer in <2nd_answer> tag.\n3. In case if there are any mismatches - rewrite caption to correct it wrapping in <corrected_caption> tags. If the caption is fine - just write "no_need".`
|
148 |
|
|
|
149 |
|
150 |
+
## Prompts for grounding:
|
151 |
|
152 |
+
If you want to add grounding, here are the prompts for each type. Several can be combined.
|
153 |
|
154 |
+
* Booru tags: `Here are grounding tags for better understanding: <tags>BOORU_TAGS</tags>.`
|
155 |
+
* Characters: `Here is a list of characters that are present in the picture: <characters>CHARACTER_NAMES</characters>.`
|
156 |
+
* Character traits or tags: `Here are popular tags or traits for each character on the picture: <character_traits>CHARACTER1: [tag1, tag2, tag3,...]\nCHARACTER2: [...]\n...'</character_traits>.`
|
157 |
+
* Any info with natural text: `Here is preliminary information about the picture: <info>GENERAL_INFO</info>.`
|
158 |
|
159 |
+
* **Avoid using character names if no grounding is used**: `Do not use names for characters.`
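
As a rough illustration of how the templates above might be filled in, the sketch below builds each grounding sentence from plain Python values. The helper name `grounding_fragments` and its parameters are hypothetical and not part of the model's own code. It also makes several assumptions: tags and names are joined with `, `, the `\n` in the character-traits template stands for a real newline, the stray `'` before `</character_traits>` is a typo and is dropped, and the "do not use names" sentence is added only when no grounding at all is supplied.

```python
def grounding_fragments(tags=None, characters=None, traits=None, info=None):
    """Return the grounding sentences for whichever inputs were supplied.

    The function and parameter names are illustrative only; the sentence
    templates are copied from the list above.
    """
    parts = []
    if tags:
        # Assumption: tags are passed as a list and joined with ", ".
        parts.append(
            "Here are grounding tags for better understanding: "
            f"<tags>{', '.join(tags)}</tags>."
        )
    if characters:
        parts.append(
            "Here is a list of characters that are present in the picture: "
            f"<characters>{', '.join(characters)}</characters>."
        )
    if traits:
        # Assumption: one "NAME: [tag1, tag2]" line per character.
        lines = "\n".join(
            f"{name}: [{', '.join(char_tags)}]" for name, char_tags in traits.items()
        )
        parts.append(
            "Here are popular tags or traits for each character on the picture: "
            f"<character_traits>{lines}</character_traits>."
        )
    if info:
        parts.append(
            "Here is preliminary information about the picture: "
            f"<info>{info}</info>."
        )
    if not parts:
        # Assumption: this sentence applies only when nothing else was given.
        parts.append("Do not use names for characters.")
    return parts


# Example usage with placeholder values.
for fragment in grounding_fragments(
    tags=["tag_a", "tag_b"],
    traits={"CHARACTER1": ["tag1", "tag2"]},
):
    print(fragment)
```

The helper returns a list rather than a single string so that the caller decides how the fragments are attached to the mode prompt.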
|
160 |
+
|
161 |
+
## Composing userprompt:
|
162 |
|
163 |
+
After specifying the selected mode, you can add the prompt part for extra grounding and then provide the grounding data wrapped in the corresponding XML tags.
|
164 |
|
165 |
+
Here is a simple Python example of composing the user prompt.
|
166 |
|
167 |
```python
|
168 |
add_tags=True #select needed
|