bigscience-bot
committed on
Commit
•
b86e38f
1
Parent(s):
e128def
new data
- logs/main_log.txt +19 -0
logs/main_log.txt
CHANGED
@@ -99617,3 +99617,22 @@ time (ms)
99617 |
time (ms)
|
99618 |
iteration 7250/ 159576 | consumed samples: 285152 | elapsed time per iteration (ms): 19421.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.554094E+00 | loss scale: 2048.0 | grad norm: 79780.312 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
99619 |
time (ms)
|
99620 |
+
iteration 7260/ 159576 | consumed samples: 286272 | elapsed time per iteration (ms): 19643.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.545351E+00 | loss scale: 2048.0 | grad norm: 153165.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
99621 |
+
time (ms)
|
99622 |
+
iteration 7270/ 159576 | consumed samples: 287392 | elapsed time per iteration (ms): 19873.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.548807E+00 | loss scale: 2048.0 | grad norm: 96725.418 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
99623 |
+
time (ms)
|
99624 |
+
iteration 7280/ 159576 | consumed samples: 288512 | elapsed time per iteration (ms): 19830.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.532312E+00 | loss scale: 2048.0 | grad norm: 85054.846 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
99625 |
+
time (ms)
|
99626 |
+
iteration 7290/ 159576 | consumed samples: 289632 | elapsed time per iteration (ms): 19469.1 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.535855E+00 | loss scale: 2048.0 | grad norm: 66255.480 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
99627 |
+
time (ms)
|
99628 |
+
iteration 7300/ 159576 | consumed samples: 290752 | elapsed time per iteration (ms): 19578.9 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.583752E+00 | loss scale: 2048.0 | grad norm: 61901.507 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
99629 |
+
time (ms)
|
99630 |
+
iteration 7310/ 159576 | consumed samples: 291872 | elapsed time per iteration (ms): 19646.2 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.539584E+00 | loss scale: 2048.0 | grad norm: 68238.513 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
99631 |
+
time (ms)
|
99632 |
+
iteration 7320/ 159576 | consumed samples: 292992 | elapsed time per iteration (ms): 19642.5 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.526649E+00 | loss scale: 2048.0 | grad norm: 69527.941 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
99633 |
+
time (ms)
|
99634 |
+
iteration 7330/ 159576 | consumed samples: 294112 | elapsed time per iteration (ms): 19508.3 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.514026E+00 | loss scale: 2048.0 | grad norm: 63745.755 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
99635 |
+
time (ms)
|
99636 |
+
iteration 7340/ 159576 | consumed samples: 295232 | elapsed time per iteration (ms): 19676.4 | learning rate: 6.000E-05 | global batch size: 112 | lm loss: 6.519949E+00 | loss scale: 2048.0 | grad norm: 96730.566 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
99637 |
+
time (ms)
|
99638 |
+
[2021-09-27 23:32:04] PULSE: tr8-104B is running for 5:48:38 since 2021-09-27T17:43:26 (1271196 on 'gpu_p13' partition (r7i7n[6-8],r8i0n[0-8],r8i1n[0-4],r8i7n[3-8],r9i0n[0-6,8],r9i1n[0-8],r9i2n0,r9i4n8,r9i5n[0-8],r9i6n[0-8],r9i7n[3-6])
|
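A note on reading the iteration lines above: each one is a pipe-delimited record, and consecutive records should satisfy (change in consumed samples) = (change in iteration) x (global batch size). The sketch below checks that for the first two records shown (7250 and 7260). The helper name `parse_iteration_line` is hypothetical and not part of the logged training code; it assumes the field layout visible in this diff and nothing more, and the sample lines are shortened copies of the logged ones.

```python
import re

# Hypothetical helper (an assumption, not from the training code): turn one
# pipe-delimited iteration record into a dict of field name -> number.
def parse_iteration_line(line):
    fields = {}
    for part in line.split("|"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after the trailing pipe
        m = re.match(r"iteration\s+(\d+)/\s*(\d+)$", part)
        if m:
            fields["iteration"] = int(m.group(1))
            fields["total iterations"] = int(m.group(2))
            continue
        if ":" in part:
            key, value = part.split(":", 1)
            fields[key.strip()] = float(value)  # handles 6.000E-05 and 285152 alike
    return fields

# Shortened copies of the records for iterations 7250 and 7260.
prev = parse_iteration_line(
    "iteration 7250/ 159576 | consumed samples: 285152 | "
    "global batch size: 112 | lm loss: 6.554094E+00 |"
)
curr = parse_iteration_line(
    "iteration 7260/ 159576 | consumed samples: 286272 | "
    "global batch size: 112 | lm loss: 6.545351E+00 |"
)

# Samples consumed between the two records vs. what the batch size predicts.
delta_samples = curr["consumed samples"] - prev["consumed samples"]
expected = (curr["iteration"] - prev["iteration"]) * curr["global batch size"]
print(delta_samples == expected)  # True: 1120 samples over 10 iterations
```

The same check can be run over any adjacent pair of records; a mismatch would suggest a skipped iteration or a change in global batch size between the two log points.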