Update model card with new training details
README.md
CHANGED
@@ -63,11 +63,49 @@ Ergo, at 1300 steps, the decision was made to cease training on the original LAI

This consisted of 17,800 images at a base resolution of 1024x1024, with about 700 samples in portrait and 700 samples in landscape.

## Contrast issues

When checkpoint 3275 was tested, a common observation was that darker images looked washed out, while brighter images seemed "meh".

Various guidance scale and CFG rescale values were tested. The best dark images occurred around `guidance_scale=9.2` and `guidance_rescale=0.0`, but they remained "washed out".
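
To make the two knobs concrete, here is a minimal sketch of how classifier-free guidance with an optional rescale step is commonly computed. It is an illustration only, not the code used for these tests: the function name `guided_prediction` is hypothetical, and it works on flat lists of floats rather than latent tensors. With `guidance_rescale=0.0` the rescale step is skipped, which matches the setting that gave the best dark images above.

```python
from statistics import pstdev


def guided_prediction(uncond, cond, guidance_scale, guidance_rescale=0.0):
    """Combine unconditional and conditional noise predictions.

    Hypothetical helper: `uncond` and `cond` are flat lists of floats
    standing in for the model's two noise predictions.
    """
    # Standard classifier-free guidance: move away from the unconditional prediction.
    cfg = [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]
    if guidance_rescale == 0.0:
        return cfg  # rescaling disabled

    std_cond = pstdev(cond)
    std_cfg = pstdev(cfg)
    if std_cfg == 0.0:
        return cfg  # nothing to rescale

    # Rescale the guided prediction to the conditional prediction's spread,
    # then blend the rescaled and unrescaled results.
    rescaled = [x * (std_cond / std_cfg) for x in cfg]
    return [
        guidance_rescale * r + (1.0 - guidance_rescale) * x
        for r, x in zip(rescaled, cfg)
    ]
```

A high `guidance_scale` inflates the spread of the guided prediction, and the rescale step pulls it back toward the spread of the conditional prediction.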

## Dataset change number two

A new LAION subset was prepared with unique images and no square images, just a limited collection of aspect ratios:

* 16:9
* 9:16
* 2:3
* 3:2

This was intended to help the model learn more quickly and to prevent overfitting on captions.

This LAION subset contained 17,800 images, evenly distributed across the aspect ratios.
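
As a rough sketch of how images could be sorted into the four aspect ratios listed above, the snippet below picks the closest ratio for a given image size and flags near-square images for exclusion. The names `ASPECT_BUCKETS`, `nearest_bucket`, and `is_square`, and the 5% squareness tolerance, are assumptions made for illustration. The actual dataset preparation is not described here.

```python
# Target aspect ratios from the list above, expressed as width / height.
ASPECT_BUCKETS = {
    "16:9": 16 / 9,
    "9:16": 9 / 16,
    "2:3": 2 / 3,
    "3:2": 3 / 2,
}


def nearest_bucket(width, height):
    """Return the name of the bucket whose ratio is closest to width / height."""
    ratio = width / height
    return min(ASPECT_BUCKETS, key=lambda name: abs(ASPECT_BUCKETS[name] - ratio))


def is_square(width, height, tolerance=0.05):
    """True when the image is within `tolerance` of a 1:1 ratio (assumed threshold)."""
    return abs(width / height - 1.0) <= tolerance


if __name__ == "__main__":
    for size in [(1920, 1080), (1080, 1920), (1000, 1500), (1024, 1024)]:
        if is_square(*size):
            print(size, "skipped (square)")
        else:
            print(size, nearest_bucket(*size))
```

If the split were exactly even, 17,800 images over four ratios would come to 4,450 images per ratio.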

The images were then captioned using BLIP2 with Flan-T5, to obtain highly accurate captions.

## Contrast fix: offset noise / SNR gamma to the rescue?

Offset noise and SNR gamma were applied experimentally to checkpoint **4250**:

* `snr_gamma=5.0`
* `noise_offset=0.2`
* `noise_pertubation=0.1`
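
For readers unfamiliar with offset noise, the sketch below shows the usual idea: each channel of the training noise gets one extra random shift shared by all of its pixels, which lets the model learn to change overall brightness. It assumes that `noise_pertubation` refers to input perturbation, meaning a small extra noise term added to the noise used to corrupt the input. That reading, and the helper names `offset_noise` and `perturb`, are assumptions, not details taken from the training code.

```python
import random


def offset_noise(channels, pixels, noise_offset=0.2, rng=random):
    """Gaussian noise for one latent, plus a constant random shift per channel."""
    noise = []
    for _ in range(channels):
        # One shift per channel, shared by every pixel in that channel.
        shift = noise_offset * rng.gauss(0.0, 1.0)
        noise.append([rng.gauss(0.0, 1.0) + shift for _ in range(pixels)])
    return noise


def perturb(noise, strength=0.1, rng=random):
    """Return a copy of `noise` with a small independent Gaussian term added.

    Assumed meaning of the perturbation setting: the perturbed copy corrupts
    the input, while the unperturbed noise stays the training target.
    """
    return [[x + strength * rng.gauss(0.0, 1.0) for x in channel] for channel in noise]


if __name__ == "__main__":
    rng = random.Random(0)
    target = offset_noise(channels=4, pixels=16, noise_offset=0.2, rng=rng)
    corrupting = perturb(target, strength=0.1, rng=rng)
    print(len(target), len(target[0]), len(corrupting))
```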

Within 25 steps of training, the contrast was back, and the prompt `a solid black square` once again produced a reasonable result.

At 50 steps of offset noise, things really seemed to "click", and `a solid black square` had the fewest deformities I've seen.

The step 75 checkpoint was broken. The SNR gamma math resulted in numeric instability, so it was disabled. The offset noise parameters were left untouched.
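
The cause of the instability is not recorded here. As background only, the sketch below assumes that `snr_gamma` refers to min-SNR loss weighting, where each timestep's loss is scaled by `min(snr, gamma) / snr` for a noise-prediction objective. The function name `min_snr_weight` is hypothetical. The division shows one place where this kind of weighting becomes numerically sensitive: the weight is undefined for any timestep whose signal-to-noise ratio is zero.

```python
def min_snr_weight(snr, gamma=5.0):
    """Min-SNR loss weight for a noise-prediction objective (assumed formulation).

    The weight is 1.0 while snr <= gamma and shrinks as snr grows beyond gamma.
    """
    if snr <= 0.0:
        # Dividing by a zero SNR would produce inf or NaN in a real training loop.
        raise ValueError("SNR must be positive to compute a min-SNR weight")
    return min(snr, gamma) / snr


if __name__ == "__main__":
    for snr in (0.5, 5.0, 10.0, 100.0):
        print(snr, min_snr_weight(snr))
```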

## Success! Improvement in quality and contrast.

Similar to the text encoder swap, the images showed a marked improvement over the next several checkpoints.

The model was left to its own devices, and at step 4475, enough improvement was observed that another revision was created in this repository.

# Status: Test release

This model has been packaged up in a test form so that it can be thoroughly assessed by users.