panopstor committed on
Commit
00bfc55
1 Parent(s): 2a84d5a

Update README.md

  1. README.md +24 -0
README.md CHANGED
@@ -1,3 +1,27 @@
1
  ---
2
  license: creativeml-openrail-m
3
  ---
4
+
5
+ SD1.5 experiments with Huber and MSE loss. All models were trained for 4 epochs on approximately 250k images from a variety of sources: approximately half from LAION Aesthetics, plus a few thousand 4K video rips with CogVLM captions.
6
+
7
+ Trained using the EveryDream2 trainer (https://github.com/victorchall/EveryDream2trainer) on an RTX 6000 Ada 48 GB. Each epoch took approximately 10 hours, for a total of about 40 hours per model.
8
+
9
+ - Multi-aspect-ratio training with a nominal size of <=768^2 pixels for each bucket
10
+ - Batch size 12 with gradient accumulation of 10 (effective batch size 120).
11
+ - AdamW 8bit optimizer with standard betas of (0.9,0.999) and weight decay of 0.010.
12
+ - Trained with automatic mixed precision FP16 and TF32 matmul
13
+ - LR of 3.0e-6 on a cosine schedule with a ~12 epoch decay target, so the LR ends around 2.3e-6 after the 4 epochs of training
14
+ - Pyramid noise using discount 0.03
15
+ - Zero offset noise of 0.02
16
+ - Min SNR gamma of 5.0
17
+ - UNet-only training; text encoder left frozen.
18
+ - Conditional dropout of 10%
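
The batch-size and learning-rate figures in the list above can be sanity-checked with a little arithmetic. The sketch below assumes a plain cosine decay to zero with no warmup, and that one optimizer step consumes batch size × grad accum images; the actual EveryDream2 scheduler may differ in those details. The helper name `cosine_lr` and the constant names are hypothetical.

```python
import math

# Values taken from the list above.
BASE_LR = 3.0e-6
BATCH_SIZE = 12
GRAD_ACCUM = 10
TRAIN_EPOCHS = 4       # epochs actually trained
DECAY_EPOCHS = 12      # "~12 epoch target to decay"
NUM_IMAGES = 250_000   # approximate dataset size


def cosine_lr(progress: float, base_lr: float = BASE_LR) -> float:
    """Cosine decay from base_lr toward 0 as progress goes 0 -> 1.

    Assumption: no warmup and a final LR of zero at the decay target.
    """
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


effective_batch = BATCH_SIZE * GRAD_ACCUM          # images per optimizer step
steps_per_epoch = NUM_IMAGES // effective_batch    # approximate optimizer steps
final_lr = cosine_lr(TRAIN_EPOCHS / DECAY_EPOCHS)  # LR when training stops

print(f"effective batch: {effective_batch}")
print(f"optimizer steps per epoch (approx): {steps_per_epoch}")
print(f"LR after {TRAIN_EPOCHS} of {DECAY_EPOCHS} epochs: {final_lr:.3g}")
```

Under these assumptions, stopping one third of the way through the cosine gives 0.75 × 3.0e-6 = 2.25e-6, which is consistent with the "around 2.3e-6" figure quoted above.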
19
+
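
To illustrate what a Min SNR gamma of 5.0 does to the per-timestep loss weight, here is a minimal sketch. It assumes the scaled-linear beta schedule commonly used with SD1.5 (0.00085 to 0.012 over 1000 steps) and the epsilon-prediction form of the weight, min(SNR, gamma) / SNR. It is not taken from the trainer's code, and the function names are hypothetical.

```python
def alphas_cumprod(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Cumulative product of (1 - beta_t) for a scaled-linear beta schedule (assumed)."""
    lo, hi = beta_start ** 0.5, beta_end ** 0.5
    out, prod = [], 1.0
    for i in range(num_steps):
        beta = (lo + (hi - lo) * i / (num_steps - 1)) ** 2
        prod *= 1.0 - beta
        out.append(prod)
    return out


ACP = alphas_cumprod()


def snr(t: int) -> float:
    """Signal-to-noise ratio at timestep t."""
    return ACP[t] / (1.0 - ACP[t])


def min_snr_weight(t: int, gamma: float = 5.0) -> float:
    """Loss weight min(SNR, gamma) / SNR for epsilon prediction."""
    s = snr(t)
    return min(s, gamma) / s


for t in (0, 250, 500, 999):
    print(t, round(snr(t), 4), round(min_snr_weight(t), 4))
```

With this weighting, low-noise timesteps (high SNR) are down-weighted, while timesteps whose SNR is already below gamma keep a weight of 1.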
20
+ The following models were produced:
21
+ - 768_huber.safetensors - Huber loss only
22
+ - 768_mse.safetensors - MSE loss only
23
+ - 768_ts0huber_ts999mse.safetensors - Huber loss at timestep 0 interpolated to MSE loss at timestep 999
24
+ - 768_ts0mse_ts999huber.safetensors - MSE loss at timestep 0 interpolated to Huber loss at timestep 999
25
+
26
+ Note that timestep 0 is the least-noised step and timestep 999 is the most-noised step.
27
+
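
The two interpolated models above blend Huber and MSE loss according to the sampled timestep. A minimal scalar sketch of one way such a blend could work is shown below. It assumes linear interpolation in the timestep and a Huber delta of 1.0; neither is stated in this README, so treat both as assumptions. The function names are hypothetical and this is not the trainer's implementation.

```python
def mse(err: float) -> float:
    """Squared error for a single residual."""
    return err * err


def huber(err: float, delta: float = 1.0) -> float:
    """Huber loss for a single residual (delta = 1.0 is an assumption)."""
    a = abs(err)
    if a <= delta:
        return 0.5 * err * err
    return delta * (a - 0.5 * delta)


def blended_loss(err, t, loss_at_0, loss_at_999, max_t=999):
    """Linearly interpolate between two losses by timestep (assumed form).

    t = 0 uses loss_at_0 only; t = max_t uses loss_at_999 only.
    """
    w = t / max_t
    return (1.0 - w) * loss_at_0(err) + w * loss_at_999(err)


# Huber at timestep 0 moving to MSE at timestep 999.
for t in (0, 500, 999):
    print(t, blended_loss(2.0, t, huber, mse))
```

Swapping the last two arguments (`mse, huber`) gives the opposite arrangement, MSE at timestep 0 moving to Huber at timestep 999.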