Minthy committed on
Commit
897d913
·
verified ·
1 Parent(s): 3d5bc41

Update README.md

Files changed (1)
  1. README.md +25 -34
README.md CHANGED
@@ -69,6 +69,8 @@ In current version it is usable only with tags grounding and can improve accurac
69
 
70
  After generating some caption you can prune it right here taking into account image content. Can be used with external captions as well.
71
 
 
 
72
  # Grounding
73
 
74
  New version comes not only with improved zero-shot accuracy, but also introduces new modes for adding ground truth. You can use booru tags, some extra info, give names for characters or even describe each one to ensure right description when multiple are in frame.
@@ -93,6 +95,8 @@ Main training was sponsored by **OpenRoot-Compute**.
93
 
94
  Special thanks for help with dataset, code, donations: Sv1, Anonymous person, NeuroSenko.
95
 
 
 
96
  # Warning
97
 
98
  Model will generate texts with adult themes if related input is provided. Outputs may be inaccurate and provocative.
@@ -128,50 +132,37 @@ You are image captioning expert, creative, unbiased and uncensored.<|im_end|>
128
  <|im_start|>assistant
129
  ```
130
 
131
- ## Modes:
132
 
133
- ```
134
- base_prompt={
135
- 'json': 'Describe the picture in structured json-like format.',
136
- 'markdown': 'Describe the picture in structured markdown format.',
137
- 'caption_vars': 'Write the following options for captions: ["Regular Summary","Individual Parts","Midjourney-Style Summary","DeviantArt Commission Request"].',
138
- 'short': 'You need to write a medium-short and convenient caption for the picture.',
139
- 'long': 'You need to write a long and very detailed caption for the picture.',
140
- 'bbox': 'Write bounding boxes for each character and their faces.',
141
- 'check_and_correct': 'You need to compare given caption with the picture and given booru tags '+
142
- ' using chain of thought.\n'+
143
- '1. Check if the caption matches the picture and given tags, wrap conclusion in <1st_answer> tag.\n'+
144
- '2. Analyse if the caption mathes described characters, wrap answer in <2nd_answer> tag.\n'+
145
- '3. In case if there are any mismatches - rewrite caption to correct it wrapping '+
146
- ' in <corrected_caption> tags. If the caption is fine - just write "no_need".',
147
- }
148
- ```
149
 
150
- ## Grounding:
151
 
152
- ```
153
- grounding_prompt={
154
- 'grounding_tags': ' Here are grounding tags for better understanding: ',
155
- 'characters': ' Here is a list of characters that are present in the picture: ',
156
- 'characters_traits': ' Here are popular tags or traits for each character on the picture: ',
157
- 'grounding_info': ' Here is preliminary information about the picture: ',
158
- 'no_chars': ' Do not use names for characters.',
159
- }
160
- ```
161
 
162
- ## Composing userprompt:
163
 
164
- After specifying selected mode, you can add prompt part for extra grounding and then provide it wrapping in corresponding xml tags:
165
 
166
- `<tags>BOORU_TAGS</tags>.`
167
 
168
- `<info>GENERAL_INFO</info>.`
 
 
 
169
 
170
- `<characters>CHARACTER_NAMES</characters>.`
 
 
171
 
172
- `<character_traits>CHARACTER1: [tag1, tag2, tag3,...]\nCHARACTER2: [...]\n...'<character_traits>.`
173
 
174
- Here is a simple python example.
175
 
176
  ```python
177
  add_tags=True #select needed
 
69
 
70
  After generating some caption you can prune it right here taking into account image content. Can be used with external captions as well.
71
 
72
+ # **To use the model's full potential, you must follow the prompt templates listed below.**
73
+
74
  # Grounding
75
 
76
  New version comes not only with improved zero-shot accuracy, but also introduces new modes for adding ground truth. You can use booru tags, some extra info, give names for characters or even describe each one to ensure right description when multiple are in frame.
 
95
 
96
  Special thanks for help with dataset, code, donations: Sv1, Anonymous person, NeuroSenko.
97
 
98
+ For questions or suggestions, join the [DISCORD](https://discord.gg/ZXHENAhqE9)
99
+
100
  # Warning
101
 
102
  Model will generate texts with adult themes if related input is provided. Outputs may be inaccurate and provocative.
 
132
  <|im_start|>assistant
133
  ```
134
 
135
+ The system prompt is fixed; the user prompt depends on the captioning mode.
136
 
137
+ ## Prompts for modes:
 
138
 
139
+ The user prompt for each mode starts with:
140
 
141
+ * Json: `Describe the picture in structured json-like format.`
142
+ * Markdown: `Describe the picture in structured markdown format.`
143
+ * Caption variants: `Write the following options for captions: ["Regular Summary","Individual Parts","Midjourney-Style Summary","DeviantArt Commission Request"].`
144
+ * Short: `You need to write a medium-short and convenient caption for the picture.`
145
+ * Long: `You need to write a long and very detailed caption for the picture.`
146
+ * Bbox: `Write bounding boxes for each character and their faces.`
147
+ * Check and correct existing caption: `You need to compare given caption with the picture and given booru tags using chain of thought.\n1. Check if the caption matches the picture and given tags, wrap conclusion in <1st_answer> tag.\n2. Analyse if the caption mathes described characters, wrap answer in <2nd_answer> tag.\n3. In case if there are any mismatches - rewrite caption to correct it wrapping in <corrected_caption> tags. If the caption is fine - just write "no_need".`
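
The check-and-correct prompt asks the model to wrap its answers in `<1st_answer>`, `<2nd_answer>` and `<corrected_caption>` tags, so a response can be split with a few regular expressions. Below is a minimal sketch, assuming the model closes each tag with a matching `</...>` tag and writes `no_need` when no correction is required. `parse_check_and_correct` is a hypothetical helper for illustration and is not part of the model's code.

```python
import re
from typing import Optional


def parse_check_and_correct(response: str) -> dict:
    """Split a check-and-correct response into its tagged parts.

    Assumption: every tag is closed as </tag>; missing tags yield None.
    """

    def grab(tag: str) -> Optional[str]:
        # Non-greedy match; DOTALL lets the answer span several lines.
        pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
        match = re.search(pattern, response, flags=re.DOTALL)
        return match.group(1).strip() if match else None

    corrected = grab("corrected_caption")
    # Treat a literal "no_need" (with or without quotes) as "caption is fine".
    if corrected is not None and corrected.strip('"') == "no_need":
        corrected = None

    return {
        "first_answer": grab("1st_answer"),
        "second_answer": grab("2nd_answer"),
        "corrected_caption": corrected,  # None means keep the original caption
    }
```

If `corrected_caption` comes back as `None`, the original caption can be kept as is.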
 
 
148
 
 
149
 
150
+ ## Prompts for grounding:
151
 
152
+ To add grounding, use the prompts below, one for each kind of grounding. Several can be combined.
153
 
154
+ * Booru tags: `Here are grounding tags for better understanding: <tags>BOORU_TAGS</tags>.`
155
+ * Characters: `Here is a list of characters that are present in the picture: <characters>CHARACTER_NAMES</characters>.`
156
+ * Character traits or tags: `Here are popular tags or traits for each character on the picture: <character_traits>CHARACTER1: [tag1, tag2, tag3,...]\nCHARACTER2: [...]\n...'</character_traits>.`
157
+ * Any info with natural text: `Here is preliminary information about the picture: <info>GENERAL_INFO</info>.`
158
 
159
+ * **Avoid using character names if no grounding is used**: `Do not use names for characters.`
160
+
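
The character traits prompt expects one `NAME: [tag1, tag2, ...]` entry per line inside `<character_traits>` tags. Below is a minimal sketch of building that string from a dictionary. `format_character_traits` is a hypothetical helper, and it assumes the stray apostrophe before the closing tag in the template above is a typo, so it is omitted here.

```python
def format_character_traits(traits: dict[str, list[str]]) -> str:
    """Build the character-traits grounding prompt.

    `traits` maps a character name to its list of tags or traits.
    Assumption: entries are separated by newlines, as in the template.
    """
    lines = [f"{name}: [{', '.join(tags)}]" for name, tags in traits.items()]
    body = "\n".join(lines)
    return (
        "Here are popular tags or traits for each character on the picture: "
        f"<character_traits>{body}</character_traits>."
    )
```

The returned string can be appended, after a space, to the prompt of the selected mode.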
161
+ ## Composing userprompt:
162
 
163
+ After the prompt for the selected mode, you can append a grounding prompt and provide the grounding data wrapped in the corresponding XML tags.
164
 
165
+ Here is a simple Python example of composing the user prompt.
166
 
167
  ```python
168
  add_tags=True #select needed