metadata
license: cc-by-nc-4.0
pipeline: null
SD 1.5 fine-tuned for 8 epochs (about 16 hours) on 131,000 high-quality captioned image pairs generated with DALL-E 3, using four RTX 3090 GPUs connected over NVLink.
It seems to be good at people, hands, and text, but not at animals.
unique examples: 13100
num examples: 131000
num epochs: 8
total train batch size: 40
gradient accumulation: 1
total optimization steps: 26200
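
As a sanity check, the step count above can be reproduced from the other numbers. This is a minimal sketch, assuming the usual convention that one optimization step consumes one total batch and that each epoch is rounded up to a whole number of steps. The function name and the per-GPU batch size are illustrative and not taken from the training script; the per-GPU figure additionally assumes all four GPUs were used for data-parallel training.

```python
import math


def total_optimization_steps(num_examples: int, total_batch_size: int, num_epochs: int) -> int:
    """Steps per epoch (rounded up) times the number of epochs."""
    steps_per_epoch = math.ceil(num_examples / total_batch_size)
    return steps_per_epoch * num_epochs


# Values reported in this card.
num_examples = 131_000
total_batch_size = 40
num_epochs = 8
grad_accum = 1
num_gpus = 4  # "4 3090s"

steps = total_optimization_steps(num_examples, total_batch_size, num_epochs)

# Assumption: total batch = per-GPU batch * number of GPUs * accumulation steps.
per_gpu_batch = total_batch_size // (num_gpus * grad_accum)

print(steps)          # 26200
print(per_gpu_batch)  # 10
```

The result matches the reported 26200 steps, which is consistent with 131000 examples per epoch at a total batch size of 40.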