Commit 8f1e501, committed by panopstor
1 Parent(s): a351edf

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -11,7 +11,8 @@ Trained using Everydream2 Trainer (https://github.com/victorchall/EveryDream2tra
  - Multi-aspect ratio trained with nominal size of <=768^2 pixels for each bucket
  - Batch size 12 with grad accum 10.
  - AdamW 8bit optimizer with standard betas of (0.9,0.999) and weight decay of 0.010.
- - Trained with automatic mixed precision FP16 and TF32 matmul
+ - Automatic mixed precision FP16 (note: grad scalar val was surprisingly identical on all runs)
+ - TF32 matmul and SDP Attention
  - 3.0e-6 LR cosine schedule with a ~12 epoch target to decay, ending around 2.3e-6 at end of training
  - Pyramid noise using discount 0.03
  - Zero offset noise of 0.02
@@ -25,5 +26,5 @@ The following models were produced:
  - 768_ts0huber_ts999mse.safetensors - Huber loss at timestep 0 interpolated to MSE loss at timestep 999
  - 768_ts0mse_ts999huber.safetensors - MSE loss at timestep 0 interpolated to Huber loss at timestep 999
 
- Worth noting timestep 0 is the lowest-noise-added step and 999 is most noised timestep.
+ Worth noting timestep 0 is the lowest-noise-added step and 999 is most noised timestep
 
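
For readers unfamiliar with two of the settings named in the README lines of this diff, here is a minimal sketch of what a cosine LR decay and a timestep-interpolated Huber/MSE loss could look like. It is not taken from the EveryDream2 trainer. The function names (`cosine_lr`, `huber`, `mse`, `blended_loss`), the plain half-cosine shape with no warmup, the linear interpolation weight, and the Huber `delta` of 1.0 are all assumptions made for illustration.

```python
import math


def cosine_lr(step: int, decay_steps: int, base_lr: float = 3.0e-6) -> float:
    """Half-cosine decay from base_lr toward 0 over decay_steps.

    Assumed form: no warmup, no LR floor. The trainer's real schedule may differ.
    """
    progress = min(max(step / decay_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def huber(residual: float, delta: float = 1.0) -> float:
    """Huber loss on a single residual (delta=1.0 is an assumed value)."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)


def mse(residual: float) -> float:
    """Squared error on a single residual (no 0.5 factor, unlike huber above)."""
    return residual * residual


def blended_loss(residual, timestep, loss_at_t0, loss_at_t999, max_timestep=999):
    """Linearly interpolate between two loss functions by timestep.

    timestep 0 (least noise) uses loss_at_t0 only;
    timestep 999 (most noise) uses loss_at_t999 only.
    Linear weighting is an assumption; the README only says "interpolated".
    """
    w = timestep / max_timestep
    return (1.0 - w) * loss_at_t0(residual) + w * loss_at_t999(residual)


if __name__ == "__main__":
    # Naming mirrors 768_ts0huber_ts999mse: Huber at timestep 0, MSE at timestep 999.
    for t in (0, 500, 999):
        print(t, blended_loss(2.0, t, huber, mse))
    # LR at the start, part-way, and at the end of the assumed decay window.
    for s in (0, 320, 1000):
        print(s, cosine_lr(s, decay_steps=1000))
```

Swapping the two loss arguments gives the `ts0mse_ts999huber` variant. Under the assumed schedule shape, a learning rate near 2.3e-6 corresponds to stopping roughly a third of the way through the decay window, which would be consistent with the README's description of a ~12 epoch decay target that was not run to completion, though the README does not state the actual epoch count.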