readme: mention currently missing grad norm
README.md
CHANGED
@@ -103,4 +103,6 @@ Unfortunately, there are nan's occuring in the training (after 7h 33m 14s of tra
 
 ![Training Loss](training-loss.png)
 
+This is very likely due to missing grad norm - which will be added soon with `Accelerator.clip_grad_norm_`.
+
 The uploaded model checkpoint is from 80k steps.
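
For readers unfamiliar with what the added README line refers to: the following is a minimal, dependency-free sketch of gradient clipping by global L2 norm, which is the operation that functions named `clip_grad_norm_` perform. The helper name `clip_by_global_norm`, the `eps` constant, and the example gradient values are assumptions made for illustration; this is not the repository's training code.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale gradient vectors so that their joint L2 norm is at most max_norm.

    grads is a list of lists of floats (one inner list per parameter).
    Returns (clipped_grads, total_norm_before_clipping).
    """
    # Joint L2 norm over every gradient entry of every parameter.
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # Only ever scale down; gradients already within the limit are left alone.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in vec] for vec in grads]
    return clipped, total_norm


# Hypothetical gradients whose joint norm is sqrt(9 + 16 + 144) = 13.
grads = [[3.0, 4.0], [12.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))
```

In a training loop built on Accelerate, the corresponding call would typically sit between `accelerator.backward(loss)` and `optimizer.step()`, in the form `accelerator.clip_grad_norm_(model.parameters(), max_norm)`. The commit does not state which `max_norm` value will be used.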