RunDiffusion committed on
Commit
fdc3962
1 Parent(s): 51f2e73

Update README.md

  1. README.md +9 -8
README.md CHANGED
@@ -75,7 +75,7 @@ license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICE
75
 
76
  # Wonderman Proof of Concept - By RunDiffusion.com
77
 
78
- ## For this POC we needed to achieve these goals
79
  - The concept cannot already exist in the Flux dataset. (That would be cheating.)
80
  - The concept needed to be present but still allow flexibility for creativity.
81
  - The concept needed to resemble the subject within 90% accuracy.
@@ -89,11 +89,12 @@ Flux thinks that "Wonderman" is "Superman"
89
  ![Flux thinks that "Wonderman" is "Superman"](Huggingface-assets/superman-flux.jpg)
90
 
91
 
92
- ## Data Used for Training
93
  You can view the [RAW low quality data here: ](https://huggingface.co/RunDiffusion/Wonderman-Flux-POC/tree/main/Raw%20Low%20Quality%20Data)
94
  The training data was low resolution, cropped, oddly shaped, pixelated, and overall the worst possible data we've come across. That didn't stop us! AI to the rescue!
95
  ![Low Quality Training Data](Huggingface-assets/multiple-samples-training-data.png)
96
 
 
97
  To fix the data we had to:
98
  - Inpaint problem areas like backgrounds, signatures, and text
99
  - Outpaint to expand images
@@ -104,7 +105,7 @@ We were able to get the dataset to 13 with these techniques.
104
  Full dataset [is here](https://huggingface.co/RunDiffusion/Wonderman-Flux-POC/tree/main/Cleaned%20and%20Captioned%20Data)
105
  ![Cleaned Wonderman Dataset](Huggingface-assets/multiple-samples-of-cleaned-data.png)
106
 
107
- ### Captioning the Data
108
  We are not entirely familiar with Flux's preferred captioning style. We understand that this model responds well to full descriptive sentences, so we went with that. Below are some examples of the images with their captions. We chose LLaMA v3 for captioning, inspired by this paper: https://arxiv.org/html/2406.08478v1
109
  The system prompt used was basic and could likely benefit from further refinement.
110
 
@@ -114,7 +115,7 @@ A vintage comic book cover of Wonderman. On the cover, there are three main char
114
  Wonderman, a male superhero character. He is wearing a green and red costume with a large 'W' emblem on the chest. Wonderman has a muscular physique, brown hair, and is wearing a black mask covering his eyes. He stands confidently with his hands by his sides. photo
115
  ![Standing Wonderman](Cleaned%20and%20Captioned%20Data/00002.png)
116
 
117
- ### Train the Data
118
  All tasks were performed on a local workstation equipped with an RTX 4090, i7 processor, and 64GB RAM. Note that 32GB RAM will not suffice, as you may encounter out-of-memory (OOM) errors when caching latents. We also used RunDiffusion.com to test the resulting LoRAs, which let us launch five servers with five checkpoints and pick the one that converged best.
119
  We're not going to dive into the rank and learning rate and stuff because this really depends on your goals and what you're trying to accomplish. But the rules below are good ones to follow.
120
  - We used Ostris's ai-toolkit available here: [Ostris ai-toolkit](https://github.com/ostris/ai-toolkit/tree/main)
@@ -130,22 +131,22 @@ You'll see in the next page of examples where the captioning really helps or hur
130
  Total time for the LoRA was about 2 to 2.5 hours. That works out to $1 to $2 on RunPod or Vast; running locally, the electricity will be even cheaper.
131
  Now for the results! (This next file is big to preserve the quality)
132
 
133
- # 500 Steps
134
  Right off the bat at 500 steps you will get some likeness. This will mostly be baseline Flux. If you're training a concept that exists then you will see some convergence even at just 500 steps.
135
  ![500 steps](Huggingface-assets/500-steps.jpg)
136
  Prompt: a vintage comic book cover for Wonderman, featuring three characters in a dynamic action scene. The central figure is Wonderman with a confident expression, wearing a green shirt with a yellow belt and red gloves. To his left is a woman with a look of concern, dressed in a yellow top and red skirt. On the right, there's a monstrous creature with sharp teeth and claws, seemingly attacking the man. The background is minimal, primarily blue with a hint of landscape at the bottom. The text WONDER COMICS and No. 11 suggests this is from a series.
137
 
138
- # 1250 Steps
139
  It will start to break apart a little bit here. Be patient. It's learning.
140
  ![1250 steps](Huggingface-assets/1250-steps.jpg)
141
  Prompt: A vintage comic book cover titled 'Wonderman Comics'. The central figure is Wonderman who appears to be in a combat stance. He is lunging at a large, menacing creature with a gaping mouth, revealing sharp teeth. Below the main characters, there's a woman in a yellow dress holding a small device, possibly a gun. She seems to be in distress. In the background, there's a futuristic-looking tower with a few figures standing atop. The overall color palette is vibrant, with dominant yellows, greens, and purples.
142
 
143
- # 1750 Steps
144
  Hey! We're getting somewhere! The caption as a prompt should be showing our subject well at this stage but the real test is breaking away from the caption to see if our subject is present.
145
  ![1750 steps](Huggingface-assets/1750-steps.jpg)
146
  Prompt: Wonderman wearing a green and red costume with a large 'W' emblem on the chest standing heroically
147
 
148
- # 2500 Steps
149
  There he is! We can now prompt more freely to get Wonderman doing other stuff. Keep in mind we will still be limited to what we trained on, but at least we have a great starting point!
150
  ![2500 steps](Huggingface-assets/2500-steps.jpg)
151
  Prompt: comic style illustration of Wonderman running from aliens on the moon. center character is Wonderman, a male superhero character. He is wearing a green and red costume with a large 'W' emblem on the chest. Black boots to his knees. Wonderman is wearing a black mask covering his eyes
 
75
 
76
  # Wonderman Proof of Concept - By RunDiffusion.com
77
 
78
+ # For this POC we needed to achieve these goals
79
  - The concept cannot already exist in the Flux dataset. (That would be cheating.)
80
  - The concept needed to be present but still allow flexibility for creativity.
81
  - The concept needed to resemble the subject within 90% accuracy.
 
89
  ![Flux thinks that "Wonderman" is "Superman"](Huggingface-assets/superman-flux.jpg)
90
 
91
 
92
+ # Data Used for Training
93
  You can view the [RAW low quality data here: ](https://huggingface.co/RunDiffusion/Wonderman-Flux-POC/tree/main/Raw%20Low%20Quality%20Data)
94
  The training data was low resolution, cropped, oddly shaped, pixelated, and overall the worst possible data we've come across. That didn't stop us! AI to the rescue!
95
  ![Low Quality Training Data](Huggingface-assets/multiple-samples-training-data.png)
96
 
97
+ ## Data Cleanup Strategy
98
  To fix the data we had to:
99
  - Inpaint problem areas like backgrounds, signatures, and text
100
  - Outpaint to expand images
 
105
  Full dataset [is here](https://huggingface.co/RunDiffusion/Wonderman-Flux-POC/tree/main/Cleaned%20and%20Captioned%20Data)
106
  ![Cleaned Wonderman Dataset](Huggingface-assets/multiple-samples-of-cleaned-data.png)
107
 
108
+ # Captioning the Data
109
  We are not entirely familiar with Flux's preferred captioning style. We understand that this model responds well to full descriptive sentences, so we went with that. Below are some examples of the images with their captions. We chose LLaMA v3 for captioning, inspired by this paper: https://arxiv.org/html/2406.08478v1
110
  The system prompt used was basic and could likely benefit from further refinement.
111
 
 
115
  Wonderman, a male superhero character. He is wearing a green and red costume with a large 'W' emblem on the chest. Wonderman has a muscular physique, brown hair, and is wearing a black mask covering his eyes. He stands confidently with his hands by his sides. photo
116
  ![Standing Wonderman](Cleaned%20and%20Captioned%20Data/00002.png)
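
  Before training, it can help to confirm that every cleaned image actually has a caption. The following is a minimal sketch, not the workflow used for this POC. It assumes each caption lives in a `.txt` file sharing the image's base name (a common convention for LoRA trainers; this README does not state how the captions are stored), and `find_missing_captions` is a hypothetical helper, not part of ai-toolkit.

```python
from pathlib import Path

# Image extensions to look for; adjust to match your dataset.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def find_missing_captions(dataset_dir, caption_ext=".txt"):
    """Return sorted file names of images with no (or an empty) caption file.

    Assumption: the caption for 00002.png is stored as 00002.txt
    in the same folder.
    """
    missing = []
    for image in sorted(Path(dataset_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTS:
            continue  # skip caption files and anything else
        caption = image.with_suffix(caption_ext)
        if not caption.is_file() or not caption.read_text(encoding="utf-8").strip():
            missing.append(image.name)
    return missing
```

  Calling `find_missing_captions("Cleaned and Captioned Data")` on a local copy of the dataset folder would return an empty list if every image is captioned under that assumed layout.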
117
 
118
+ # Train the Data
119
  All tasks were performed on a local workstation equipped with an RTX 4090, i7 processor, and 64GB RAM. Note that 32GB RAM will not suffice, as you may encounter out-of-memory (OOM) errors when caching latents. We also used RunDiffusion.com to test the resulting LoRAs, which let us launch five servers with five checkpoints and pick the one that converged best.
120
  We're not going to dive into the rank and learning rate and stuff because this really depends on your goals and what you're trying to accomplish. But the rules below are good ones to follow.
121
  - We used Ostris's ai-toolkit available here: [Ostris ai-toolkit](https://github.com/ostris/ai-toolkit/tree/main)
 
131
  Total time for the LoRA was about 2 to 2.5 hours. That works out to $1 to $2 on RunPod or Vast; running locally, the electricity will be even cheaper.
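
  The time and cost figures above imply a range of hourly GPU rates. A small sketch of that arithmetic follows; the rates are back-solved only from the $1 to $2 and 2 to 2.5 hour figures quoted here, not taken from any provider's price list, and `implied_hourly_rate` is a hypothetical helper name.

```python
def implied_hourly_rate(total_cost_usd, hours):
    """Hourly rate implied by a total cost and a run time."""
    return total_cost_usd / hours


# Cheapest case: $1 spread over the longest run (2.5 hours).
low = implied_hourly_rate(1.0, 2.5)
# Most expensive case: $2 spent on the shortest run (2 hours).
high = implied_hourly_rate(2.0, 2.0)

print(f"Implied GPU rate: ${low:.2f} to ${high:.2f} per hour")
# prints "Implied GPU rate: $0.40 to $1.00 per hour"
```

  To budget a different run, multiply your provider's actual hourly rate by the expected training time.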
132
  Now for the results! (This next file is big to preserve the quality)
133
 
134
+ ## 500 Steps
135
  Right off the bat at 500 steps you will get some likeness. This will mostly be baseline Flux. If you're training a concept that exists then you will see some convergence even at just 500 steps.
136
  ![500 steps](Huggingface-assets/500-steps.jpg)
137
  Prompt: a vintage comic book cover for Wonderman, featuring three characters in a dynamic action scene. The central figure is Wonderman with a confident expression, wearing a green shirt with a yellow belt and red gloves. To his left is a woman with a look of concern, dressed in a yellow top and red skirt. On the right, there's a monstrous creature with sharp teeth and claws, seemingly attacking the man. The background is minimal, primarily blue with a hint of landscape at the bottom. The text WONDER COMICS and No. 11 suggests this is from a series.
138
 
139
+ ## 1250 Steps
140
  It will start to break apart a little bit here. Be patient. It's learning.
141
  ![1250 steps](Huggingface-assets/1250-steps.jpg)
142
  Prompt: A vintage comic book cover titled 'Wonderman Comics'. The central figure is Wonderman who appears to be in a combat stance. He is lunging at a large, menacing creature with a gaping mouth, revealing sharp teeth. Below the main characters, there's a woman in a yellow dress holding a small device, possibly a gun. She seems to be in distress. In the background, there's a futuristic-looking tower with a few figures standing atop. The overall color palette is vibrant, with dominant yellows, greens, and purples.
143
 
144
+ ## 1750 Steps
145
  Hey! We're getting somewhere! The caption as a prompt should be showing our subject well at this stage but the real test is breaking away from the caption to see if our subject is present.
146
  ![1750 steps](Huggingface-assets/1750-steps.jpg)
147
  Prompt: Wonderman wearing a green and red costume with a large 'W' emblem on the chest standing heroically
148
 
149
+ ## 2500 Steps
150
  There he is! We can now prompt more freely to get Wonderman doing other stuff. Keep in mind we will still be limited to what we trained on, but at least we have a great starting point!
151
  ![2500 steps](Huggingface-assets/2500-steps.jpg)
152
  Prompt: comic style illustration of Wonderman running from aliens on the moon. center character is Wonderman, a male superhero character. He is wearing a green and red costume with a large 'W' emblem on the chest. Black boots to his knees. Wonderman is wearing a black mask covering his eyes
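
  The checkpoint comparison above can be organised as a simple job list: every saved checkpoint is tested against the same prompts with the same seed, so the checkpoint is the only thing that changes between images. This is a sketch under stated assumptions, not the setup used for this POC. The checkpoint file name pattern, the `wonderman_lora` name, the seed value, and `build_eval_jobs` are all hypothetical; the step counts and prompts are taken (the second one shortened) from the sections above.

```python
from itertools import product

# Step counts shown in this README.
CHECKPOINT_STEPS = [500, 1250, 1750, 2500]

# Prompts from the README; the second is shortened for brevity.
PROMPTS = [
    "Wonderman wearing a green and red costume with a large 'W' emblem "
    "on the chest standing heroically",
    "comic style illustration of Wonderman running from aliens on the moon",
]


def build_eval_jobs(steps, prompts, seed=42, name="wonderman_lora"):
    """Build one job per (checkpoint, prompt) pair with a fixed seed.

    The file name pattern below is an assumption for illustration;
    use whatever names your trainer actually writes.
    """
    return [
        {
            "checkpoint": f"{name}_{step:06d}.safetensors",
            "prompt": prompt,
            "seed": seed,
        }
        for step, prompt in product(steps, prompts)
    ]


jobs = build_eval_jobs(CHECKPOINT_STEPS, PROMPTS)
for job in jobs:
    print(job["checkpoint"], "|", job["prompt"][:40])
```

  Each job can then be handed to whichever inference setup you use. Including at least one prompt that departs from the training captions follows the advice in the 1750 Steps section.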