KBlueLeaf committed on
Commit cae9940 · verified · 1 Parent(s): 4d56c28

Update README.md

Files changed (1)
  1. README.md +26 -22
README.md CHANGED
@@ -74,28 +74,32 @@ As same as kxl eps rev2, I add realbooru and pvc figure images for more flexibil
 
  ## Training
  - Hardware: Quad RTX 3090s
- - Num Train Images: 8,468,798
- - Total Epoch: 1
- - Total Steps: 16548
- - Training Time: 430 hours (wall time)
- - Batch Size: 4
- - Grad Accumulation Step: 32
- - Equivalent Batch Size: 512
- - Optimizer: Lion8bit
- - Learning Rate: 1e-5 for UNet / TE training disabled
- - LR Scheduler: Constant (with warmup)
- - Warmup Steps: 100
- - Weight Decay: 0.1
- - Betas: 0.9, 0.95
- - Min SNR Gamma: 5
- - Debiased Estimation Loss: Enabled
- - IP Noise Gamma: 0.05
- - Resolution: 1024x1024
- - Min Bucket Resolution: 256
- - Max Bucket Resolution: 4096
- - Mixed Precision: FP16
- - Caption Tag Dropout: 0.2
- - Caption Group Dropout: 0.2 (for dropping tag/nl caption entirely)
+ - Training
+   - Num Train Images: 8,468,798
+   - Total Epoch: 1
+   - Total Steps: 16548
+   - Training Time: 430 hours (wall time)
+   - Batch Size: 4
+   - Grad Accumulation Step: 32
+   - Equivalent Batch Size: 512
+   - Mixed Precision: FP16
+ - Optimizer
+   - Optimizer: Lion8bit
+   - Learning Rate: 1e-5 for UNet / TE training disabled
+   - LR Scheduler: Constant (with warmup)
+   - Warmup Steps: 100
+   - Weight Decay: 0.1
+   - Betas: 0.9, 0.95
+ - Diffusion
+   - Min SNR Gamma: 5
+   - Debiased Estimation Loss: Enabled
+   - IP Noise Gamma: 0.05
+   - Resolution: 1024x1024
+   - Min Bucket Resolution: 256
+   - Max Bucket Resolution: 4096
+ - Other
+   - Caption Tag Dropout: 0.2
+   - Caption Group Dropout: 0.2 (for dropping tag/nl caption entirely)
 
 
  ## Why do you still use SDXL but not any Brand New DiT-Based Models?
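
Review note on the Training figures in this diff: the batch-size and step numbers can be cross-checked with a little arithmetic. The sketch below is not part of the README. It assumes that "Batch Size: 4" is per GPU and that "Quad RTX 3090s" means 4 data-parallel workers. The function names are hypothetical helpers written for this check.

```python
import math

# Values copied from the README's Training section.
NUM_GPUS = 4                  # "Quad RTX 3090s" (assumed: 4 data-parallel workers)
PER_GPU_BATCH = 4             # "Batch Size: 4" (assumed to be per GPU)
GRAD_ACCUM_STEPS = 32         # "Grad Accumulation Step: 32"
NUM_TRAIN_IMAGES = 8_468_798  # "Num Train Images"
REPORTED_STEPS = 16_548       # "Total Steps"
TRAINING_HOURS = 430          # "Training Time" (wall time)


def effective_batch_size(num_gpus: int, per_gpu_batch: int, grad_accum: int) -> int:
    """Samples consumed per optimizer step across all GPUs."""
    return num_gpus * per_gpu_batch * grad_accum


def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """Optimizer steps needed to see every image once (last batch may be partial)."""
    return math.ceil(num_images / batch_size)


def seconds_per_step(hours: float, steps: int) -> float:
    """Average wall-clock seconds per optimizer step."""
    return hours * 3600 / steps


if __name__ == "__main__":
    batch = effective_batch_size(NUM_GPUS, PER_GPU_BATCH, GRAD_ACCUM_STEPS)
    print("effective batch size:", batch)
    print("estimated steps for 1 epoch:", steps_per_epoch(NUM_TRAIN_IMAGES, batch))
    print("reported steps:", REPORTED_STEPS)
    print("avg seconds per step:", round(seconds_per_step(TRAINING_HOURS, REPORTED_STEPS), 2))
```

Under these assumptions the product matches the listed "Equivalent Batch Size: 512", and the one-epoch estimate lands within a few steps of the reported total. The small gap may come from per-bucket batching at different resolutions, but the README does not say.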