bigscience-bot
committed on
Commit
•
d0b4f1a
1
Parent(s):
adb6c42
new data
logs/main_log.txt +458 -0
logs/main_log.txt
CHANGED
@@ -42181,3 +42181,461 @@ time (ms)
|
|
42181 |
time (ms)
|
42182 |
[2021-09-24 23:07:37] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
|
42183 |
[2021-09-24 23:07:37] PULSE: tr8-104B is running for 17:15:26 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
|
42184 |
+
iteration 5253/ 159576 | consumed samples: 134016 | elapsed time per iteration (ms): 15553.2 | learning rate: 3.709E-05 | global batch size: 48 | lm loss: 6.395989E+00 | loss scale: 4096.0 | grad norm: 75934.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
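The added `iteration` lines all share one layout: an `iteration N/ M` counter followed by `key: value` fields separated by `|`. A minimal sketch of reading such a line into a dictionary follows, assuming every field after the counter holds a single numeric value. The function name `parse_iteration_line` and the `SAMPLE` constant are illustrative and are not part of the training code that produced this log.

```python
import re

# One line copied from the log above, used only as sample input.
SAMPLE = (
    "iteration 5253/ 159576 | consumed samples: 134016 | "
    "elapsed time per iteration (ms): 15553.2 | learning rate: 3.709E-05 | "
    "global batch size: 48 | lm loss: 6.395989E+00 | loss scale: 4096.0 | "
    "grad norm: 75934.711 | num zeros: 0.0 | "
    "number of skipped iterations: 0 | number of nan iterations: 0 |"
)


def parse_iteration_line(line: str) -> dict:
    """Parse one 'iteration N/ M | key: value | ...' line into a dict.

    Hypothetical helper: assumes each field after the counter is numeric.
    """
    # Split on the pipe separator and drop empty trailing fields.
    fields = [f.strip() for f in line.strip().split("|") if f.strip()]
    if not fields:
        raise ValueError("empty line")

    # The first field carries the current and total iteration counts.
    match = re.match(r"iteration\s+(\d+)\s*/\s*(\d+)", fields[0])
    if match is None:
        raise ValueError("not an iteration line")
    record = {
        "iteration": int(match.group(1)),
        "total_iterations": int(match.group(2)),
    }

    # Remaining fields are 'name: value'; split on the last colon so that
    # names containing parentheses or spaces stay intact.
    for field in fields[1:]:
        key, _, value = field.rpartition(":")
        record[key.strip()] = float(value)
    return record


record = parse_iteration_line(SAMPLE)
print(record["iteration"], record["lm loss"], record["grad norm"])
```

Parsed this way, consecutive records can be compared directly. In the lines above, for example, `consumed samples` grows by 48 per iteration, which matches the logged `global batch size`.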
42185 |
+
time (ms)
|
42186 |
+
iteration 5254/ 159576 | consumed samples: 134064 | elapsed time per iteration (ms): 15521.6 | learning rate: 3.710E-05 | global batch size: 48 | lm loss: 6.388237E+00 | loss scale: 4096.0 | grad norm: 85225.047 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42187 |
+
time (ms)
|
42188 |
+
iteration 5255/ 159576 | consumed samples: 134112 | elapsed time per iteration (ms): 15886.3 | learning rate: 3.711E-05 | global batch size: 48 | lm loss: 6.348703E+00 | loss scale: 4096.0 | grad norm: 72802.836 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42189 |
+
time (ms)
|
42190 |
+
iteration 5256/ 159576 | consumed samples: 134160 | elapsed time per iteration (ms): 15520.3 | learning rate: 3.713E-05 | global batch size: 48 | lm loss: 6.321572E+00 | loss scale: 4096.0 | grad norm: 73245.874 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42191 |
+
time (ms)
|
42192 |
+
iteration 5257/ 159576 | consumed samples: 134208 | elapsed time per iteration (ms): 15443.7 | learning rate: 3.714E-05 | global batch size: 48 | lm loss: 6.335665E+00 | loss scale: 4096.0 | grad norm: 58798.760 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42193 |
+
time (ms)
|
42194 |
+
iteration 5258/ 159576 | consumed samples: 134256 | elapsed time per iteration (ms): 15427.0 | learning rate: 3.715E-05 | global batch size: 48 | lm loss: 6.319070E+00 | loss scale: 4096.0 | grad norm: 66591.391 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42195 |
+
time (ms)
|
42196 |
+
iteration 5259/ 159576 | consumed samples: 134304 | elapsed time per iteration (ms): 15760.6 | learning rate: 3.717E-05 | global batch size: 48 | lm loss: 6.229961E+00 | loss scale: 4096.0 | grad norm: 78411.623 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42197 |
+
time (ms)
|
42198 |
+
iteration 5260/ 159576 | consumed samples: 134352 | elapsed time per iteration (ms): 15544.0 | learning rate: 3.718E-05 | global batch size: 48 | lm loss: 6.379896E+00 | loss scale: 4096.0 | grad norm: 82294.960 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42199 |
+
time (ms)
|
42200 |
+
iteration 5261/ 159576 | consumed samples: 134400 | elapsed time per iteration (ms): 15397.8 | learning rate: 3.719E-05 | global batch size: 48 | lm loss: 6.233184E+00 | loss scale: 4096.0 | grad norm: 65525.586 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42201 |
+
time (ms)
|
42202 |
+
iteration 5262/ 159576 | consumed samples: 134448 | elapsed time per iteration (ms): 15498.3 | learning rate: 3.721E-05 | global batch size: 48 | lm loss: 6.326461E+00 | loss scale: 4096.0 | grad norm: 101232.286 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42203 |
+
time (ms)
|
42204 |
+
iteration 5263/ 159576 | consumed samples: 134496 | elapsed time per iteration (ms): 15834.8 | learning rate: 3.722E-05 | global batch size: 48 | lm loss: 6.351873E+00 | loss scale: 4096.0 | grad norm: 82652.498 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42205 |
+
time (ms)
|
42206 |
+
iteration 5264/ 159576 | consumed samples: 134544 | elapsed time per iteration (ms): 15450.4 | learning rate: 3.723E-05 | global batch size: 48 | lm loss: 6.411518E+00 | loss scale: 4096.0 | grad norm: 79704.233 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42207 |
+
time (ms)
|
42208 |
+
iteration 5265/ 159576 | consumed samples: 134592 | elapsed time per iteration (ms): 15408.5 | learning rate: 3.725E-05 | global batch size: 48 | lm loss: 6.324855E+00 | loss scale: 4096.0 | grad norm: 96783.723 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42209 |
+
time (ms)
|
42210 |
+
iteration 5266/ 159576 | consumed samples: 134640 | elapsed time per iteration (ms): 15369.4 | learning rate: 3.726E-05 | global batch size: 48 | lm loss: 6.351592E+00 | loss scale: 4096.0 | grad norm: 96231.447 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42211 |
+
time (ms)
|
42212 |
+
iteration 5267/ 159576 | consumed samples: 134688 | elapsed time per iteration (ms): 15643.8 | learning rate: 3.727E-05 | global batch size: 48 | lm loss: 6.439371E+00 | loss scale: 4096.0 | grad norm: 86165.942 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42213 |
+
time (ms)
|
42214 |
+
iteration 5268/ 159576 | consumed samples: 134736 | elapsed time per iteration (ms): 15428.0 | learning rate: 3.729E-05 | global batch size: 48 | lm loss: 6.282881E+00 | loss scale: 4096.0 | grad norm: 95370.085 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42215 |
+
time (ms)
|
42216 |
+
iteration 5269/ 159576 | consumed samples: 134784 | elapsed time per iteration (ms): 15422.7 | learning rate: 3.730E-05 | global batch size: 48 | lm loss: 6.489480E+00 | loss scale: 4096.0 | grad norm: 77407.640 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42217 |
+
time (ms)
|
42218 |
+
iteration 5270/ 159576 | consumed samples: 134832 | elapsed time per iteration (ms): 15384.0 | learning rate: 3.731E-05 | global batch size: 48 | lm loss: 6.382200E+00 | loss scale: 4096.0 | grad norm: 66716.315 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42219 |
+
time (ms)
|
42220 |
+
iteration 5271/ 159576 | consumed samples: 134880 | elapsed time per iteration (ms): 15581.8 | learning rate: 3.733E-05 | global batch size: 48 | lm loss: 6.409722E+00 | loss scale: 4096.0 | grad norm: 68218.526 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42221 |
+
time (ms)
|
42222 |
+
iteration 5272/ 159576 | consumed samples: 134928 | elapsed time per iteration (ms): 15395.7 | learning rate: 3.734E-05 | global batch size: 48 | lm loss: 6.493249E+00 | loss scale: 4096.0 | grad norm: 71580.496 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42223 |
+
time (ms)
|
42224 |
+
iteration 5273/ 159576 | consumed samples: 134976 | elapsed time per iteration (ms): 15402.4 | learning rate: 3.735E-05 | global batch size: 48 | lm loss: 6.376624E+00 | loss scale: 4096.0 | grad norm: 85075.910 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42225 |
+
time (ms)
|
42226 |
+
iteration 5274/ 159576 | consumed samples: 135024 | elapsed time per iteration (ms): 15424.2 | learning rate: 3.737E-05 | global batch size: 48 | lm loss: 6.441435E+00 | loss scale: 4096.0 | grad norm: 75286.225 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42227 |
+
time (ms)
|
42228 |
+
iteration 5275/ 159576 | consumed samples: 135072 | elapsed time per iteration (ms): 15616.5 | learning rate: 3.738E-05 | global batch size: 48 | lm loss: 6.428281E+00 | loss scale: 4096.0 | grad norm: 71317.497 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42229 |
+
time (ms)
|
42230 |
+
iteration 5276/ 159576 | consumed samples: 135120 | elapsed time per iteration (ms): 15383.8 | learning rate: 3.739E-05 | global batch size: 48 | lm loss: 6.324539E+00 | loss scale: 4096.0 | grad norm: 70509.208 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42231 |
+
time (ms)
|
42232 |
+
iteration 5277/ 159576 | consumed samples: 135168 | elapsed time per iteration (ms): 15404.4 | learning rate: 3.741E-05 | global batch size: 48 | lm loss: 6.396560E+00 | loss scale: 4096.0 | grad norm: 68223.773 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42233 |
+
time (ms)
|
42234 |
+
iteration 5278/ 159576 | consumed samples: 135216 | elapsed time per iteration (ms): 15464.0 | learning rate: 3.742E-05 | global batch size: 48 | lm loss: 6.403405E+00 | loss scale: 4096.0 | grad norm: 74828.040 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42235 |
+
time (ms)
|
42236 |
+
iteration 5279/ 159576 | consumed samples: 135264 | elapsed time per iteration (ms): 15572.0 | learning rate: 3.743E-05 | global batch size: 48 | lm loss: 6.340907E+00 | loss scale: 4096.0 | grad norm: 103719.466 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42237 |
+
time (ms)
|
42238 |
+
iteration 5280/ 159576 | consumed samples: 135312 | elapsed time per iteration (ms): 15390.1 | learning rate: 3.745E-05 | global batch size: 48 | lm loss: 6.465801E+00 | loss scale: 4096.0 | grad norm: 71954.053 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42239 |
+
time (ms)
|
42240 |
+
iteration 5281/ 159576 | consumed samples: 135360 | elapsed time per iteration (ms): 15379.3 | learning rate: 3.746E-05 | global batch size: 48 | lm loss: 6.481463E+00 | loss scale: 4096.0 | grad norm: 64156.580 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42241 |
+
time (ms)
|
42242 |
+
iteration 5282/ 159576 | consumed samples: 135408 | elapsed time per iteration (ms): 15880.0 | learning rate: 3.747E-05 | global batch size: 48 | lm loss: 6.324627E+00 | loss scale: 4096.0 | grad norm: 77974.806 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42243 |
+
time (ms)
|
42244 |
+
iteration 5283/ 159576 | consumed samples: 135456 | elapsed time per iteration (ms): 15461.2 | learning rate: 3.749E-05 | global batch size: 48 | lm loss: 6.278036E+00 | loss scale: 4096.0 | grad norm: 78417.449 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42245 |
+
time (ms)
|
42246 |
+
iteration 5284/ 159576 | consumed samples: 135504 | elapsed time per iteration (ms): 15434.3 | learning rate: 3.750E-05 | global batch size: 48 | lm loss: 6.470399E+00 | loss scale: 4096.0 | grad norm: 70677.576 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42247 |
+
time (ms)
|
42248 |
+
iteration 5285/ 159576 | consumed samples: 135552 | elapsed time per iteration (ms): 15453.3 | learning rate: 3.751E-05 | global batch size: 48 | lm loss: 6.465354E+00 | loss scale: 4096.0 | grad norm: 72699.042 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42249 |
+
time (ms)
|
42250 |
+
iteration 5286/ 159576 | consumed samples: 135600 | elapsed time per iteration (ms): 15799.4 | learning rate: 3.753E-05 | global batch size: 48 | lm loss: 6.366466E+00 | loss scale: 4096.0 | grad norm: 87890.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42251 |
+
time (ms)
|
42252 |
+
iteration 5287/ 159576 | consumed samples: 135648 | elapsed time per iteration (ms): 15462.6 | learning rate: 3.754E-05 | global batch size: 48 | lm loss: 6.450302E+00 | loss scale: 4096.0 | grad norm: 65500.276 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42253 |
+
time (ms)
|
42254 |
+
iteration 5288/ 159576 | consumed samples: 135696 | elapsed time per iteration (ms): 15449.3 | learning rate: 3.755E-05 | global batch size: 48 | lm loss: 6.211058E+00 | loss scale: 4096.0 | grad norm: 91309.432 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42255 |
+
time (ms)
|
42256 |
+
iteration 5289/ 159576 | consumed samples: 135744 | elapsed time per iteration (ms): 15440.0 | learning rate: 3.757E-05 | global batch size: 48 | lm loss: 6.439297E+00 | loss scale: 4096.0 | grad norm: 78139.415 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42257 |
+
time (ms)
|
42258 |
+
iteration 5290/ 159576 | consumed samples: 135792 | elapsed time per iteration (ms): 15759.6 | learning rate: 3.758E-05 | global batch size: 48 | lm loss: 6.295393E+00 | loss scale: 4096.0 | grad norm: 67343.216 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42259 |
+
time (ms)
|
42260 |
+
iteration 5291/ 159576 | consumed samples: 135840 | elapsed time per iteration (ms): 15513.6 | learning rate: 3.759E-05 | global batch size: 48 | lm loss: 6.403075E+00 | loss scale: 4096.0 | grad norm: 88227.795 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42261 |
+
time (ms)
|
42262 |
+
iteration 5292/ 159576 | consumed samples: 135888 | elapsed time per iteration (ms): 15421.3 | learning rate: 3.761E-05 | global batch size: 48 | lm loss: 6.414333E+00 | loss scale: 4096.0 | grad norm: 78788.254 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42263 |
+
time (ms)
|
42264 |
+
iteration 5293/ 159576 | consumed samples: 135936 | elapsed time per iteration (ms): 15345.3 | learning rate: 3.762E-05 | global batch size: 48 | lm loss: 6.292488E+00 | loss scale: 4096.0 | grad norm: 59708.880 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42265 |
+
time (ms)
|
42266 |
+
iteration 5294/ 159576 | consumed samples: 135984 | elapsed time per iteration (ms): 16027.7 | learning rate: 3.763E-05 | global batch size: 48 | lm loss: 6.385753E+00 | loss scale: 4096.0 | grad norm: 102775.204 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42267 |
+
time (ms)
|
42268 |
+
iteration 5295/ 159576 | consumed samples: 136032 | elapsed time per iteration (ms): 15461.5 | learning rate: 3.765E-05 | global batch size: 48 | lm loss: 6.324437E+00 | loss scale: 4096.0 | grad norm: 71697.534 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42269 |
+
time (ms)
|
42270 |
+
iteration 5296/ 159576 | consumed samples: 136080 | elapsed time per iteration (ms): 15433.9 | learning rate: 3.766E-05 | global batch size: 48 | lm loss: 6.384956E+00 | loss scale: 4096.0 | grad norm: 102953.672 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42271 |
+
time (ms)
|
42272 |
+
iteration 5297/ 159576 | consumed samples: 136128 | elapsed time per iteration (ms): 15429.7 | learning rate: 3.767E-05 | global batch size: 48 | lm loss: 6.436825E+00 | loss scale: 4096.0 | grad norm: 75031.086 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42273 |
+
time (ms)
|
42274 |
+
iteration 5298/ 159576 | consumed samples: 136176 | elapsed time per iteration (ms): 15818.4 | learning rate: 3.769E-05 | global batch size: 48 | lm loss: 6.482272E+00 | loss scale: 4096.0 | grad norm: 65276.986 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42275 |
+
time (ms)
|
42276 |
+
iteration 5299/ 159576 | consumed samples: 136224 | elapsed time per iteration (ms): 15441.5 | learning rate: 3.770E-05 | global batch size: 48 | lm loss: 6.589076E+00 | loss scale: 4096.0 | grad norm: 121561.959 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42277 |
+
time (ms)
|
42278 |
+
iteration 5300/ 159576 | consumed samples: 136272 | elapsed time per iteration (ms): 15422.2 | learning rate: 3.771E-05 | global batch size: 48 | lm loss: 6.405668E+00 | loss scale: 4096.0 | grad norm: 62093.972 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42279 |
+
time (ms)
|
42280 |
+
iteration 5301/ 159576 | consumed samples: 136320 | elapsed time per iteration (ms): 15355.0 | learning rate: 3.773E-05 | global batch size: 48 | lm loss: 6.390646E+00 | loss scale: 4096.0 | grad norm: 56038.998 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42281 |
+
time (ms)
|
42282 |
+
iteration 5302/ 159576 | consumed samples: 136368 | elapsed time per iteration (ms): 15565.3 | learning rate: 3.774E-05 | global batch size: 48 | lm loss: 6.410752E+00 | loss scale: 4096.0 | grad norm: 64581.105 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42283 |
+
time (ms)
|
42284 |
+
iteration 5303/ 159576 | consumed samples: 136416 | elapsed time per iteration (ms): 15422.3 | learning rate: 3.775E-05 | global batch size: 48 | lm loss: 6.448494E+00 | loss scale: 4096.0 | grad norm: 77740.769 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42285 |
+
time (ms)
|
42286 |
+
iteration 5304/ 159576 | consumed samples: 136464 | elapsed time per iteration (ms): 15454.6 | learning rate: 3.777E-05 | global batch size: 48 | lm loss: 6.436998E+00 | loss scale: 4096.0 | grad norm: 86587.477 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42287 |
+
time (ms)
|
42288 |
+
iteration 5305/ 159576 | consumed samples: 136512 | elapsed time per iteration (ms): 15410.7 | learning rate: 3.778E-05 | global batch size: 48 | lm loss: 6.360906E+00 | loss scale: 4096.0 | grad norm: 102483.307 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42289 |
+
time (ms)
|
42290 |
+
iteration 5306/ 159576 | consumed samples: 136560 | elapsed time per iteration (ms): 15590.5 | learning rate: 3.779E-05 | global batch size: 48 | lm loss: 6.449046E+00 | loss scale: 4096.0 | grad norm: 63898.529 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42291 |
+
time (ms)
|
42292 |
+
iteration 5307/ 159576 | consumed samples: 136608 | elapsed time per iteration (ms): 15506.8 | learning rate: 3.781E-05 | global batch size: 48 | lm loss: 6.467348E+00 | loss scale: 4096.0 | grad norm: 66863.281 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42293 |
+
time (ms)
|
42294 |
+
iteration 5308/ 159576 | consumed samples: 136656 | elapsed time per iteration (ms): 15351.0 | learning rate: 3.782E-05 | global batch size: 48 | lm loss: 6.301440E+00 | loss scale: 4096.0 | grad norm: 66038.590 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42295 |
+
time (ms)
|
42296 |
+
iteration 5309/ 159576 | consumed samples: 136704 | elapsed time per iteration (ms): 15547.1 | learning rate: 3.783E-05 | global batch size: 48 | lm loss: 6.314401E+00 | loss scale: 4096.0 | grad norm: 100622.046 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42297 |
+
time (ms)
|
42298 |
+
iteration 5310/ 159576 | consumed samples: 136752 | elapsed time per iteration (ms): 15714.1 | learning rate: 3.785E-05 | global batch size: 48 | lm loss: 6.474138E+00 | loss scale: 4096.0 | grad norm: 100713.919 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42299 |
+
time (ms)
|
42300 |
+
iteration 5311/ 159576 | consumed samples: 136800 | elapsed time per iteration (ms): 15441.4 | learning rate: 3.786E-05 | global batch size: 48 | lm loss: 6.429978E+00 | loss scale: 4096.0 | grad norm: 73118.420 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42301 |
+
time (ms)
|
42302 |
+
iteration 5312/ 159576 | consumed samples: 136848 | elapsed time per iteration (ms): 15448.2 | learning rate: 3.787E-05 | global batch size: 48 | lm loss: 6.322928E+00 | loss scale: 4096.0 | grad norm: 79244.189 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42303 |
+
time (ms)
|
42304 |
+
iteration 5313/ 159576 | consumed samples: 136896 | elapsed time per iteration (ms): 15801.3 | learning rate: 3.789E-05 | global batch size: 48 | lm loss: 6.536728E+00 | loss scale: 4096.0 | grad norm: 80004.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42305 |
+
time (ms)
|
42306 |
+
iteration 5314/ 159576 | consumed samples: 136944 | elapsed time per iteration (ms): 15420.7 | learning rate: 3.790E-05 | global batch size: 48 | lm loss: 6.358313E+00 | loss scale: 4096.0 | grad norm: 73656.992 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42307 |
+
time (ms)
|
42308 |
+
iteration 5315/ 159576 | consumed samples: 136992 | elapsed time per iteration (ms): 15430.5 | learning rate: 3.791E-05 | global batch size: 48 | lm loss: 6.285139E+00 | loss scale: 4096.0 | grad norm: 72555.490 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42309 |
+
time (ms)
|
42310 |
+
iteration 5316/ 159576 | consumed samples: 137040 | elapsed time per iteration (ms): 15418.3 | learning rate: 3.793E-05 | global batch size: 48 | lm loss: 6.355993E+00 | loss scale: 4096.0 | grad norm: 89604.868 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42311 |
+
time (ms)
|
42312 |
+
iteration 5317/ 159576 | consumed samples: 137088 | elapsed time per iteration (ms): 15767.6 | learning rate: 3.794E-05 | global batch size: 48 | lm loss: 6.370296E+00 | loss scale: 4096.0 | grad norm: 68760.061 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42313 |
+
time (ms)
|
42314 |
+
iteration 5318/ 159576 | consumed samples: 137136 | elapsed time per iteration (ms): 15469.0 | learning rate: 3.795E-05 | global batch size: 48 | lm loss: 6.401207E+00 | loss scale: 4096.0 | grad norm: 64825.425 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42315 |
+
time (ms)
|
42316 |
+
iteration 5319/ 159576 | consumed samples: 137184 | elapsed time per iteration (ms): 15469.4 | learning rate: 3.797E-05 | global batch size: 48 | lm loss: 6.433188E+00 | loss scale: 4096.0 | grad norm: 75954.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42317 |
+
time (ms)
|
42318 |
+
iteration 5320/ 159576 | consumed samples: 137232 | elapsed time per iteration (ms): 15484.0 | learning rate: 3.798E-05 | global batch size: 48 | lm loss: 6.422481E+00 | loss scale: 4096.0 | grad norm: 85143.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42319 |
+
time (ms)
|
42320 |
+
iteration 5321/ 159576 | consumed samples: 137280 | elapsed time per iteration (ms): 15773.2 | learning rate: 3.799E-05 | global batch size: 48 | lm loss: 6.394318E+00 | loss scale: 4096.0 | grad norm: 81431.726 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42321 |
+
time (ms)
|
42322 |
+
iteration 5322/ 159576 | consumed samples: 137328 | elapsed time per iteration (ms): 15339.5 | learning rate: 3.801E-05 | global batch size: 48 | lm loss: 6.498918E+00 | loss scale: 4096.0 | grad norm: 76418.870 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42323 |
+
time (ms)
|
42324 |
+
iteration 5323/ 159576 | consumed samples: 137376 | elapsed time per iteration (ms): 15420.7 | learning rate: 3.802E-05 | global batch size: 48 | lm loss: 6.518599E+00 | loss scale: 4096.0 | grad norm: 71705.255 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42325 |
+
time (ms)
|
42326 |
+
iteration 5324/ 159576 | consumed samples: 137424 | elapsed time per iteration (ms): 15420.3 | learning rate: 3.803E-05 | global batch size: 48 | lm loss: 6.429631E+00 | loss scale: 4096.0 | grad norm: 57358.188 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42327 |
+
time (ms)
|
42328 |
+
iteration 5325/ 159576 | consumed samples: 137472 | elapsed time per iteration (ms): 15903.1 | learning rate: 3.805E-05 | global batch size: 48 | lm loss: 6.407781E+00 | loss scale: 4096.0 | grad norm: 91506.505 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42329 |
+
time (ms)
|
42330 |
+
iteration 5326/ 159576 | consumed samples: 137520 | elapsed time per iteration (ms): 15425.4 | learning rate: 3.806E-05 | global batch size: 48 | lm loss: 6.399868E+00 | loss scale: 4096.0 | grad norm: 68843.352 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42331 |
+
time (ms)
|
42332 |
+
iteration 5327/ 159576 | consumed samples: 137568 | elapsed time per iteration (ms): 15444.3 | learning rate: 3.807E-05 | global batch size: 48 | lm loss: 6.412372E+00 | loss scale: 4096.0 | grad norm: 67149.711 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42333 |
+
time (ms)
|
42334 |
+
iteration 5328/ 159576 | consumed samples: 137616 | elapsed time per iteration (ms): 15406.6 | learning rate: 3.809E-05 | global batch size: 48 | lm loss: 6.430699E+00 | loss scale: 4096.0 | grad norm: 102742.719 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42335 |
+
time (ms)
|
42336 |
+
iteration 5329/ 159576 | consumed samples: 137664 | elapsed time per iteration (ms): 15722.7 | learning rate: 3.810E-05 | global batch size: 48 | lm loss: 6.415520E+00 | loss scale: 4096.0 | grad norm: 73301.472 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42337 |
+
time (ms)
|
42338 |
+
iteration 5330/ 159576 | consumed samples: 137712 | elapsed time per iteration (ms): 15405.0 | learning rate: 3.811E-05 | global batch size: 48 | lm loss: 6.359590E+00 | loss scale: 4096.0 | grad norm: 70222.523 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42339 |
+
time (ms)
|
42340 |
+
iteration 5331/ 159576 | consumed samples: 137760 | elapsed time per iteration (ms): 15374.6 | learning rate: 3.813E-05 | global batch size: 48 | lm loss: 6.443409E+00 | loss scale: 4096.0 | grad norm: 79619.657 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42341 |
+
time (ms)
|
42342 |
+
iteration 5332/ 159576 | consumed samples: 137808 | elapsed time per iteration (ms): 15404.3 | learning rate: 3.814E-05 | global batch size: 48 | lm loss: 6.412749E+00 | loss scale: 4096.0 | grad norm: 110889.514 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42343 |
+
time (ms)
|
42344 |
+
iteration 5333/ 159576 | consumed samples: 137856 | elapsed time per iteration (ms): 15590.4 | learning rate: 3.815E-05 | global batch size: 48 | lm loss: 6.492513E+00 | loss scale: 4096.0 | grad norm: 80255.448 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42345 |
+
time (ms)
|
42346 |
+
iteration 5334/ 159576 | consumed samples: 137904 | elapsed time per iteration (ms): 15436.5 | learning rate: 3.817E-05 | global batch size: 48 | lm loss: 6.400149E+00 | loss scale: 4096.0 | grad norm: 69554.344 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42347 |
+
time (ms)
|
42348 |
+
iteration 5335/ 159576 | consumed samples: 137952 | elapsed time per iteration (ms): 15422.0 | learning rate: 3.818E-05 | global batch size: 48 | lm loss: 6.473186E+00 | loss scale: 4096.0 | grad norm: 96185.543 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42349 |
+
time (ms)
|
42350 |
+
iteration 5336/ 159576 | consumed samples: 138000 | elapsed time per iteration (ms): 15442.7 | learning rate: 3.819E-05 | global batch size: 48 | lm loss: 6.552884E+00 | loss scale: 4096.0 | grad norm: 73254.921 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42351 |
+
time (ms)
|
42352 |
+
iteration 5337/ 159576 | consumed samples: 138048 | elapsed time per iteration (ms): 15634.6 | learning rate: 3.821E-05 | global batch size: 48 | lm loss: 6.365612E+00 | loss scale: 4096.0 | grad norm: 57539.381 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42353 |
+
time (ms)
|
42354 |
+
iteration 5338/ 159576 | consumed samples: 138096 | elapsed time per iteration (ms): 15386.8 | learning rate: 3.822E-05 | global batch size: 48 | lm loss: 6.445109E+00 | loss scale: 4096.0 | grad norm: 67382.289 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42355 |
+
time (ms)
|
42356 |
+
iteration 5339/ 159576 | consumed samples: 138144 | elapsed time per iteration (ms): 15470.1 | learning rate: 3.823E-05 | global batch size: 48 | lm loss: 6.353713E+00 | loss scale: 4096.0 | grad norm: 110272.660 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42357 |
+
time (ms)
|
42358 |
+
iteration 5340/ 159576 | consumed samples: 138192 | elapsed time per iteration (ms): 15791.0 | learning rate: 3.825E-05 | global batch size: 48 | lm loss: 6.413539E+00 | loss scale: 4096.0 | grad norm: 72349.998 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5341/ 159576 | consumed samples: 138240 | elapsed time per iteration (ms): 15411.4 | learning rate: 3.826E-05 | global batch size: 48 | lm loss: 6.347322E+00 | loss scale: 4096.0 | grad norm: 61859.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5342/ 159576 | consumed samples: 138288 | elapsed time per iteration (ms): 15471.9 | learning rate: 3.827E-05 | global batch size: 48 | lm loss: 6.298682E+00 | loss scale: 4096.0 | grad norm: 78125.812 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5343/ 159576 | consumed samples: 138336 | elapsed time per iteration (ms): 15450.5 | learning rate: 3.829E-05 | global batch size: 48 | lm loss: 6.346509E+00 | loss scale: 4096.0 | grad norm: 76921.340 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5344/ 159576 | consumed samples: 138384 | elapsed time per iteration (ms): 15797.4 | learning rate: 3.830E-05 | global batch size: 48 | lm loss: 6.464560E+00 | loss scale: 4096.0 | grad norm: 73833.261 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5345/ 159576 | consumed samples: 138432 | elapsed time per iteration (ms): 15447.3 | learning rate: 3.831E-05 | global batch size: 48 | lm loss: 6.491942E+00 | loss scale: 4096.0 | grad norm: 58609.094 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5346/ 159576 | consumed samples: 138480 | elapsed time per iteration (ms): 15470.6 | learning rate: 3.833E-05 | global batch size: 48 | lm loss: 6.408776E+00 | loss scale: 4096.0 | grad norm: 61084.726 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5347/ 159576 | consumed samples: 138528 | elapsed time per iteration (ms): 15595.7 | learning rate: 3.834E-05 | global batch size: 48 | lm loss: 6.317072E+00 | loss scale: 4096.0 | grad norm: 79107.564 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5348/ 159576 | consumed samples: 138576 | elapsed time per iteration (ms): 15857.5 | learning rate: 3.835E-05 | global batch size: 48 | lm loss: 6.342214E+00 | loss scale: 4096.0 | grad norm: 82396.508 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5349/ 159576 | consumed samples: 138624 | elapsed time per iteration (ms): 15501.3 | learning rate: 3.837E-05 | global batch size: 48 | lm loss: 6.416060E+00 | loss scale: 4096.0 | grad norm: 58909.391 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5350/ 159576 | consumed samples: 138672 | elapsed time per iteration (ms): 15334.9 | learning rate: 3.838E-05 | global batch size: 48 | lm loss: 6.348287E+00 | loss scale: 4096.0 | grad norm: 54069.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5351/ 159576 | consumed samples: 138720 | elapsed time per iteration (ms): 15454.2 | learning rate: 3.839E-05 | global batch size: 48 | lm loss: 6.456007E+00 | loss scale: 4096.0 | grad norm: 61307.306 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5352/ 159576 | consumed samples: 138768 | elapsed time per iteration (ms): 15972.1 | learning rate: 3.841E-05 | global batch size: 48 | lm loss: 6.276731E+00 | loss scale: 4096.0 | grad norm: 62789.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5353/ 159576 | consumed samples: 138816 | elapsed time per iteration (ms): 15447.0 | learning rate: 3.842E-05 | global batch size: 48 | lm loss: 6.443192E+00 | loss scale: 4096.0 | grad norm: 75454.112 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5354/ 159576 | consumed samples: 138864 | elapsed time per iteration (ms): 15426.1 | learning rate: 3.843E-05 | global batch size: 48 | lm loss: 6.301665E+00 | loss scale: 4096.0 | grad norm: 66381.021 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5355/ 159576 | consumed samples: 138912 | elapsed time per iteration (ms): 15465.4 | learning rate: 3.845E-05 | global batch size: 48 | lm loss: 6.453572E+00 | loss scale: 4096.0 | grad norm: 63236.178 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5356/ 159576 | consumed samples: 138960 | elapsed time per iteration (ms): 15595.7 | learning rate: 3.846E-05 | global batch size: 48 | lm loss: 6.391494E+00 | loss scale: 4096.0 | grad norm: 78457.049 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5357/ 159576 | consumed samples: 139008 | elapsed time per iteration (ms): 15508.4 | learning rate: 3.847E-05 | global batch size: 48 | lm loss: 6.379974E+00 | loss scale: 4096.0 | grad norm: 85282.485 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5358/ 159576 | consumed samples: 139056 | elapsed time per iteration (ms): 15495.7 | learning rate: 3.849E-05 | global batch size: 48 | lm loss: 6.517261E+00 | loss scale: 4096.0 | grad norm: 75329.391 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5359/ 159576 | consumed samples: 139104 | elapsed time per iteration (ms): 15455.1 | learning rate: 3.850E-05 | global batch size: 48 | lm loss: 6.311386E+00 | loss scale: 4096.0 | grad norm: 74599.792 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5360/ 159576 | consumed samples: 139152 | elapsed time per iteration (ms): 15693.4 | learning rate: 3.851E-05 | global batch size: 48 | lm loss: 6.481428E+00 | loss scale: 4096.0 | grad norm: 77215.648 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5361/ 159576 | consumed samples: 139200 | elapsed time per iteration (ms): 15475.6 | learning rate: 3.853E-05 | global batch size: 48 | lm loss: 6.331719E+00 | loss scale: 4096.0 | grad norm: 60279.803 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5362/ 159576 | consumed samples: 139248 | elapsed time per iteration (ms): 15551.6 | learning rate: 3.854E-05 | global batch size: 48 | lm loss: 6.506707E+00 | loss scale: 4096.0 | grad norm: 57442.387 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5363/ 159576 | consumed samples: 139296 | elapsed time per iteration (ms): 15525.0 | learning rate: 3.855E-05 | global batch size: 48 | lm loss: 6.283090E+00 | loss scale: 4096.0 | grad norm: 69167.961 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5364/ 159576 | consumed samples: 139344 | elapsed time per iteration (ms): 15703.9 | learning rate: 3.857E-05 | global batch size: 48 | lm loss: 6.344968E+00 | loss scale: 4096.0 | grad norm: 66351.451 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5365/ 159576 | consumed samples: 139392 | elapsed time per iteration (ms): 15511.9 | learning rate: 3.858E-05 | global batch size: 48 | lm loss: 6.402239E+00 | loss scale: 4096.0 | grad norm: 69893.747 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5366/ 159576 | consumed samples: 139440 | elapsed time per iteration (ms): 15507.6 | learning rate: 3.859E-05 | global batch size: 48 | lm loss: 6.510591E+00 | loss scale: 4096.0 | grad norm: 73294.922 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5367/ 159576 | consumed samples: 139488 | elapsed time per iteration (ms): 15841.0 | learning rate: 3.861E-05 | global batch size: 48 | lm loss: 6.292207E+00 | loss scale: 4096.0 | grad norm: 69220.189 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5368/ 159576 | consumed samples: 139536 | elapsed time per iteration (ms): 15748.2 | learning rate: 3.862E-05 | global batch size: 48 | lm loss: 6.492587E+00 | loss scale: 4096.0 | grad norm: 78294.485 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5369/ 159576 | consumed samples: 139584 | elapsed time per iteration (ms): 15492.3 | learning rate: 3.863E-05 | global batch size: 48 | lm loss: 6.493845E+00 | loss scale: 4096.0 | grad norm: 94517.294 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5370/ 159576 | consumed samples: 139632 | elapsed time per iteration (ms): 15493.8 | learning rate: 3.864E-05 | global batch size: 48 | lm loss: 6.430061E+00 | loss scale: 4096.0 | grad norm: 77523.471 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5371/ 159576 | consumed samples: 139680 | elapsed time per iteration (ms): 15870.2 | learning rate: 3.866E-05 | global batch size: 48 | lm loss: 6.411311E+00 | loss scale: 4096.0 | grad norm: 69582.630 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5372/ 159576 | consumed samples: 139728 | elapsed time per iteration (ms): 15517.9 | learning rate: 3.867E-05 | global batch size: 48 | lm loss: 6.515477E+00 | loss scale: 4096.0 | grad norm: 75626.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5373/ 159576 | consumed samples: 139776 | elapsed time per iteration (ms): 15491.8 | learning rate: 3.868E-05 | global batch size: 48 | lm loss: 6.453342E+00 | loss scale: 4096.0 | grad norm: 69940.821 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5374/ 159576 | consumed samples: 139824 | elapsed time per iteration (ms): 15511.6 | learning rate: 3.870E-05 | global batch size: 48 | lm loss: 6.378087E+00 | loss scale: 4096.0 | grad norm: 70420.660 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5375/ 159576 | consumed samples: 139872 | elapsed time per iteration (ms): 15836.7 | learning rate: 3.871E-05 | global batch size: 48 | lm loss: 6.371119E+00 | loss scale: 4096.0 | grad norm: 56046.647 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5376/ 159576 | consumed samples: 139920 | elapsed time per iteration (ms): 15468.7 | learning rate: 3.872E-05 | global batch size: 48 | lm loss: 6.480386E+00 | loss scale: 4096.0 | grad norm: 67254.408 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5377/ 159576 | consumed samples: 139968 | elapsed time per iteration (ms): 15505.8 | learning rate: 3.874E-05 | global batch size: 48 | lm loss: 6.445705E+00 | loss scale: 4096.0 | grad norm: 58120.342 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5378/ 159576 | consumed samples: 140016 | elapsed time per iteration (ms): 15512.2 | learning rate: 3.875E-05 | global batch size: 48 | lm loss: 6.383876E+00 | loss scale: 4096.0 | grad norm: 63811.158 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5379/ 159576 | consumed samples: 140064 | elapsed time per iteration (ms): 15885.3 | learning rate: 3.876E-05 | global batch size: 48 | lm loss: 6.430426E+00 | loss scale: 4096.0 | grad norm: 71627.105 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5380/ 159576 | consumed samples: 140112 | elapsed time per iteration (ms): 15514.4 | learning rate: 3.878E-05 | global batch size: 48 | lm loss: 6.352599E+00 | loss scale: 4096.0 | grad norm: 55768.573 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5381/ 159576 | consumed samples: 140160 | elapsed time per iteration (ms): 15536.5 | learning rate: 3.879E-05 | global batch size: 48 | lm loss: 6.462265E+00 | loss scale: 4096.0 | grad norm: 76307.339 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5382/ 159576 | consumed samples: 140208 | elapsed time per iteration (ms): 15499.8 | learning rate: 3.880E-05 | global batch size: 48 | lm loss: 6.439154E+00 | loss scale: 4096.0 | grad norm: 97619.861 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5383/ 159576 | consumed samples: 140256 | elapsed time per iteration (ms): 15693.9 | learning rate: 3.882E-05 | global batch size: 48 | lm loss: 6.327425E+00 | loss scale: 4096.0 | grad norm: 69803.211 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5384/ 159576 | consumed samples: 140304 | elapsed time per iteration (ms): 15550.5 | learning rate: 3.883E-05 | global batch size: 48 | lm loss: 6.391693E+00 | loss scale: 4096.0 | grad norm: 66211.348 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5385/ 159576 | consumed samples: 140352 | elapsed time per iteration (ms): 15520.0 | learning rate: 3.884E-05 | global batch size: 48 | lm loss: 6.323473E+00 | loss scale: 4096.0 | grad norm: 68034.810 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5386/ 159576 | consumed samples: 140400 | elapsed time per iteration (ms): 15545.0 | learning rate: 3.886E-05 | global batch size: 48 | lm loss: 6.299393E+00 | loss scale: 4096.0 | grad norm: 85492.599 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5387/ 159576 | consumed samples: 140448 | elapsed time per iteration (ms): 15684.9 | learning rate: 3.887E-05 | global batch size: 48 | lm loss: 6.374225E+00 | loss scale: 4096.0 | grad norm: 72949.757 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5388/ 159576 | consumed samples: 140496 | elapsed time per iteration (ms): 15553.2 | learning rate: 3.888E-05 | global batch size: 48 | lm loss: 6.446224E+00 | loss scale: 4096.0 | grad norm: 83315.401 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5389/ 159576 | consumed samples: 140544 | elapsed time per iteration (ms): 15520.1 | learning rate: 3.890E-05 | global batch size: 48 | lm loss: 6.336344E+00 | loss scale: 4096.0 | grad norm: 60566.619 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5390/ 159576 | consumed samples: 140592 | elapsed time per iteration (ms): 15438.2 | learning rate: 3.891E-05 | global batch size: 48 | lm loss: 6.437949E+00 | loss scale: 4096.0 | grad norm: 93800.672 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5391/ 159576 | consumed samples: 140640 | elapsed time per iteration (ms): 15842.4 | learning rate: 3.892E-05 | global batch size: 48 | lm loss: 6.445059E+00 | loss scale: 4096.0 | grad norm: 67207.362 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5392/ 159576 | consumed samples: 140688 | elapsed time per iteration (ms): 15543.4 | learning rate: 3.894E-05 | global batch size: 48 | lm loss: 6.340952E+00 | loss scale: 4096.0 | grad norm: 92289.634 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5393/ 159576 | consumed samples: 140736 | elapsed time per iteration (ms): 15518.9 | learning rate: 3.895E-05 | global batch size: 48 | lm loss: 6.416577E+00 | loss scale: 4096.0 | grad norm: 84099.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5394/ 159576 | consumed samples: 140784 | elapsed time per iteration (ms): 15997.3 | learning rate: 3.896E-05 | global batch size: 48 | lm loss: 6.439622E+00 | loss scale: 4096.0 | grad norm: 54809.573 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5395/ 159576 | consumed samples: 140832 | elapsed time per iteration (ms): 15450.3 | learning rate: 3.898E-05 | global batch size: 48 | lm loss: 6.441430E+00 | loss scale: 4096.0 | grad norm: 63144.662 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5396/ 159576 | consumed samples: 140880 | elapsed time per iteration (ms): 15568.2 | learning rate: 3.899E-05 | global batch size: 48 | lm loss: 6.424047E+00 | loss scale: 4096.0 | grad norm: 106261.057 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5397/ 159576 | consumed samples: 140928 | elapsed time per iteration (ms): 15464.4 | learning rate: 3.900E-05 | global batch size: 48 | lm loss: 6.325677E+00 | loss scale: 4096.0 | grad norm: 64383.277 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5398/ 159576 | consumed samples: 140976 | elapsed time per iteration (ms): 15883.9 | learning rate: 3.902E-05 | global batch size: 48 | lm loss: 6.582463E+00 | loss scale: 4096.0 | grad norm: 66662.490 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5399/ 159576 | consumed samples: 141024 | elapsed time per iteration (ms): 15497.5 | learning rate: 3.903E-05 | global batch size: 48 | lm loss: 6.498641E+00 | loss scale: 4096.0 | grad norm: 59391.511 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5400/ 159576 | consumed samples: 141072 | elapsed time per iteration (ms): 15569.9 | learning rate: 3.904E-05 | global batch size: 48 | lm loss: 6.283938E+00 | loss scale: 4096.0 | grad norm: 64487.813 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5401/ 159576 | consumed samples: 141120 | elapsed time per iteration (ms): 15526.8 | learning rate: 3.906E-05 | global batch size: 48 | lm loss: 6.336715E+00 | loss scale: 4096.0 | grad norm: 57781.336 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5402/ 159576 | consumed samples: 141168 | elapsed time per iteration (ms): 15981.6 | learning rate: 3.907E-05 | global batch size: 48 | lm loss: 6.293415E+00 | loss scale: 4096.0 | grad norm: 92738.567 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5403/ 159576 | consumed samples: 141216 | elapsed time per iteration (ms): 15632.0 | learning rate: 3.908E-05 | global batch size: 48 | lm loss: 6.294649E+00 | loss scale: 4096.0 | grad norm: 62910.047 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5404/ 159576 | consumed samples: 141264 | elapsed time per iteration (ms): 15497.6 | learning rate: 3.910E-05 | global batch size: 48 | lm loss: 6.331801E+00 | loss scale: 4096.0 | grad norm: 64648.240 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5405/ 159576 | consumed samples: 141312 | elapsed time per iteration (ms): 15498.1 | learning rate: 3.911E-05 | global batch size: 48 | lm loss: 6.406822E+00 | loss scale: 4096.0 | grad norm: 71416.233 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5406/ 159576 | consumed samples: 141360 | elapsed time per iteration (ms): 15867.4 | learning rate: 3.912E-05 | global batch size: 48 | lm loss: 6.404875E+00 | loss scale: 4096.0 | grad norm: 56955.709 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5407/ 159576 | consumed samples: 141408 | elapsed time per iteration (ms): 15506.2 | learning rate: 3.914E-05 | global batch size: 48 | lm loss: 6.428100E+00 | loss scale: 4096.0 | grad norm: 65410.012 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5408/ 159576 | consumed samples: 141456 | elapsed time per iteration (ms): 15573.9 | learning rate: 3.915E-05 | global batch size: 48 | lm loss: 6.352518E+00 | loss scale: 4096.0 | grad norm: 57463.162 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5409/ 159576 | consumed samples: 141504 | elapsed time per iteration (ms): 15570.8 | learning rate: 3.916E-05 | global batch size: 48 | lm loss: 6.276915E+00 | loss scale: 4096.0 | grad norm: 56808.465 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5410/ 159576 | consumed samples: 141552 | elapsed time per iteration (ms): 15647.9 | learning rate: 3.918E-05 | global batch size: 48 | lm loss: 6.388402E+00 | loss scale: 4096.0 | grad norm: 55831.269 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5411/ 159576 | consumed samples: 141600 | elapsed time per iteration (ms): 15527.8 | learning rate: 3.919E-05 | global batch size: 48 | lm loss: 6.359324E+00 | loss scale: 4096.0 | grad norm: 58176.863 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5412/ 159576 | consumed samples: 141648 | elapsed time per iteration (ms): 15485.9 | learning rate: 3.920E-05 | global batch size: 48 | lm loss: 6.410316E+00 | loss scale: 4096.0 | grad norm: 58797.382 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5413/ 159576 | consumed samples: 141696 | elapsed time per iteration (ms): 15570.6 | learning rate: 3.922E-05 | global batch size: 48 | lm loss: 6.487602E+00 | loss scale: 4096.0 | grad norm: 54779.384 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5414/ 159576 | consumed samples: 141744 | elapsed time per iteration (ms): 15692.4 | learning rate: 3.923E-05 | global batch size: 48 | lm loss: 6.538764E+00 | loss scale: 4096.0 | grad norm: 56952.810 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5415/ 159576 | consumed samples: 141808 | elapsed time per iteration (ms): 16423.4 | learning rate: 3.925E-05 | global batch size: 64 | lm loss: 6.468464E+00 | loss scale: 4096.0 | grad norm: 47962.953 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5416/ 159576 | consumed samples: 141872 | elapsed time per iteration (ms): 16486.4 | learning rate: 3.927E-05 | global batch size: 64 | lm loss: 6.358836E+00 | loss scale: 4096.0 | grad norm: 79746.041 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5417/ 159576 | consumed samples: 141936 | elapsed time per iteration (ms): 16837.9 | learning rate: 3.928E-05 | global batch size: 64 | lm loss: 6.458796E+00 | loss scale: 4096.0 | grad norm: 72485.233 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5418/ 159576 | consumed samples: 142000 | elapsed time per iteration (ms): 16282.1 | learning rate: 3.930E-05 | global batch size: 64 | lm loss: 6.325031E+00 | loss scale: 4096.0 | grad norm: 50657.294 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5419/ 159576 | consumed samples: 142064 | elapsed time per iteration (ms): 16473.5 | learning rate: 3.932E-05 | global batch size: 64 | lm loss: 6.393603E+00 | loss scale: 4096.0 | grad norm: 53317.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5420/ 159576 | consumed samples: 142128 | elapsed time per iteration (ms): 16358.3 | learning rate: 3.934E-05 | global batch size: 64 | lm loss: 6.505975E+00 | loss scale: 4096.0 | grad norm: 76759.970 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5421/ 159576 | consumed samples: 142192 | elapsed time per iteration (ms): 16646.9 | learning rate: 3.936E-05 | global batch size: 64 | lm loss: 6.377459E+00 | loss scale: 4096.0 | grad norm: 61658.865 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5422/ 159576 | consumed samples: 142256 | elapsed time per iteration (ms): 16480.4 | learning rate: 3.937E-05 | global batch size: 64 | lm loss: 6.350579E+00 | loss scale: 4096.0 | grad norm: 61672.596 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5423/ 159576 | consumed samples: 142320 | elapsed time per iteration (ms): 16500.8 | learning rate: 3.939E-05 | global batch size: 64 | lm loss: 6.359305E+00 | loss scale: 4096.0 | grad norm: 71934.386 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5424/ 159576 | consumed samples: 142384 | elapsed time per iteration (ms): 16400.7 | learning rate: 3.941E-05 | global batch size: 64 | lm loss: 6.515474E+00 | loss scale: 4096.0 | grad norm: 62262.598 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5425/ 159576 | consumed samples: 142448 | elapsed time per iteration (ms): 16686.7 | learning rate: 3.943E-05 | global batch size: 64 | lm loss: 6.377324E+00 | loss scale: 4096.0 | grad norm: 66128.264 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5426/ 159576 | consumed samples: 142512 | elapsed time per iteration (ms): 16346.9 | learning rate: 3.944E-05 | global batch size: 64 | lm loss: 6.394655E+00 | loss scale: 4096.0 | grad norm: 64276.983 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5427/ 159576 | consumed samples: 142576 | elapsed time per iteration (ms): 16454.0 | learning rate: 3.946E-05 | global batch size: 64 | lm loss: 6.417256E+00 | loss scale: 4096.0 | grad norm: 55916.762 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5428/ 159576 | consumed samples: 142640 | elapsed time per iteration (ms): 16713.8 | learning rate: 3.948E-05 | global batch size: 64 | lm loss: 6.314127E+00 | loss scale: 4096.0 | grad norm: 65443.157 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5429/ 159576 | consumed samples: 142704 | elapsed time per iteration (ms): 16492.7 | learning rate: 3.950E-05 | global batch size: 64 | lm loss: 6.349669E+00 | loss scale: 4096.0 | grad norm: 64819.083 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5430/ 159576 | consumed samples: 142768 | elapsed time per iteration (ms): 16430.1 | learning rate: 3.951E-05 | global batch size: 64 | lm loss: 6.406096E+00 | loss scale: 4096.0 | grad norm: 72027.252 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5431/ 159576 | consumed samples: 142832 | elapsed time per iteration (ms): 16452.9 | learning rate: 3.953E-05 | global batch size: 64 | lm loss: 6.422045E+00 | loss scale: 4096.0 | grad norm: 59470.191 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5432/ 159576 | consumed samples: 142896 | elapsed time per iteration (ms): 16574.0 | learning rate: 3.955E-05 | global batch size: 64 | lm loss: 6.384964E+00 | loss scale: 4096.0 | grad norm: 59229.555 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5433/ 159576 | consumed samples: 142960 | elapsed time per iteration (ms): 16448.4 | learning rate: 3.957E-05 | global batch size: 64 | lm loss: 6.388242E+00 | loss scale: 4096.0 | grad norm: 51139.017 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5434/ 159576 | consumed samples: 143024 | elapsed time per iteration (ms): 16378.2 | learning rate: 3.959E-05 | global batch size: 64 | lm loss: 6.422913E+00 | loss scale: 4096.0 | grad norm: 55548.958 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5435/ 159576 | consumed samples: 143088 | elapsed time per iteration (ms): 16838.8 | learning rate: 3.960E-05 | global batch size: 64 | lm loss: 6.399693E+00 | loss scale: 4096.0 | grad norm: 87728.143 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5436/ 159576 | consumed samples: 143152 | elapsed time per iteration (ms): 16458.9 | learning rate: 3.962E-05 | global batch size: 64 | lm loss: 6.291359E+00 | loss scale: 4096.0 | grad norm: 65955.697 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5437/ 159576 | consumed samples: 143216 | elapsed time per iteration (ms): 16425.2 | learning rate: 3.964E-05 | global batch size: 64 | lm loss: 6.367932E+00 | loss scale: 4096.0 | grad norm: 63150.328 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
time (ms)
iteration 5438/ 159576 | consumed samples: 143280 | elapsed time per iteration (ms): 16418.8 | learning rate: 3.966E-05 | global batch size: 64 | lm loss: 6.365756E+00 | loss scale: 4096.0 | grad norm: 57427.195 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42555 |
+
time (ms)
|
42556 |
+
iteration 5439/ 159576 | consumed samples: 143344 | elapsed time per iteration (ms): 16802.3 | learning rate: 3.967E-05 | global batch size: 64 | lm loss: 6.415596E+00 | loss scale: 4096.0 | grad norm: 61605.287 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42557 |
+
time (ms)
|
42558 |
+
iteration 5440/ 159576 | consumed samples: 143408 | elapsed time per iteration (ms): 16516.9 | learning rate: 3.969E-05 | global batch size: 64 | lm loss: 6.414165E+00 | loss scale: 4096.0 | grad norm: 64434.632 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42559 |
+
time (ms)
|
42560 |
+
iteration 5441/ 159576 | consumed samples: 143472 | elapsed time per iteration (ms): 16398.0 | learning rate: 3.971E-05 | global batch size: 64 | lm loss: 6.425170E+00 | loss scale: 4096.0 | grad norm: 63830.236 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42561 |
+
time (ms)
|
42562 |
+
iteration 5442/ 159576 | consumed samples: 143536 | elapsed time per iteration (ms): 16330.0 | learning rate: 3.973E-05 | global batch size: 64 | lm loss: 6.420317E+00 | loss scale: 4096.0 | grad norm: 80818.483 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42563 |
+
time (ms)
|
42564 |
+
iteration 5443/ 159576 | consumed samples: 143600 | elapsed time per iteration (ms): 16646.2 | learning rate: 3.975E-05 | global batch size: 64 | lm loss: 6.404300E+00 | loss scale: 4096.0 | grad norm: 66058.957 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42565 |
+
time (ms)
|
42566 |
+
iteration 5444/ 159576 | consumed samples: 143664 | elapsed time per iteration (ms): 16389.9 | learning rate: 3.976E-05 | global batch size: 64 | lm loss: 6.307170E+00 | loss scale: 4096.0 | grad norm: 64553.082 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42567 |
+
time (ms)
|
42568 |
+
iteration 5445/ 159576 | consumed samples: 143728 | elapsed time per iteration (ms): 16425.8 | learning rate: 3.978E-05 | global batch size: 64 | lm loss: 6.474117E+00 | loss scale: 4096.0 | grad norm: 54414.389 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42569 |
+
time (ms)
|
42570 |
+
iteration 5446/ 159576 | consumed samples: 143792 | elapsed time per iteration (ms): 16855.6 | learning rate: 3.980E-05 | global batch size: 64 | lm loss: 6.329272E+00 | loss scale: 4096.0 | grad norm: 67896.275 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42571 |
+
time (ms)
|
42572 |
+
iteration 5447/ 159576 | consumed samples: 143856 | elapsed time per iteration (ms): 16363.1 | learning rate: 3.982E-05 | global batch size: 64 | lm loss: 6.485427E+00 | loss scale: 4096.0 | grad norm: 55200.098 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42573 |
+
time (ms)
|
42574 |
+
iteration 5448/ 159576 | consumed samples: 143920 | elapsed time per iteration (ms): 16446.4 | learning rate: 3.983E-05 | global batch size: 64 | lm loss: 6.474103E+00 | loss scale: 4096.0 | grad norm: 58759.422 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42575 |
+
time (ms)
|
42576 |
+
iteration 5449/ 159576 | consumed samples: 143984 | elapsed time per iteration (ms): 16365.5 | learning rate: 3.985E-05 | global batch size: 64 | lm loss: 6.386650E+00 | loss scale: 4096.0 | grad norm: 69075.558 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42577 |
+
time (ms)
|
42578 |
+
iteration 5450/ 159576 | consumed samples: 144048 | elapsed time per iteration (ms): 16855.4 | learning rate: 3.987E-05 | global batch size: 64 | lm loss: 6.407839E+00 | loss scale: 4096.0 | grad norm: 76751.714 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42579 |
+
time (ms)
|
42580 |
+
iteration 5451/ 159576 | consumed samples: 144112 | elapsed time per iteration (ms): 16481.2 | learning rate: 3.989E-05 | global batch size: 64 | lm loss: 6.437217E+00 | loss scale: 4096.0 | grad norm: 60762.834 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42581 |
+
time (ms)
|
42582 |
+
iteration 5452/ 159576 | consumed samples: 144176 | elapsed time per iteration (ms): 16387.3 | learning rate: 3.991E-05 | global batch size: 64 | lm loss: 6.391966E+00 | loss scale: 4096.0 | grad norm: 57835.999 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42583 |
+
time (ms)
|
42584 |
+
iteration 5453/ 159576 | consumed samples: 144240 | elapsed time per iteration (ms): 16456.9 | learning rate: 3.992E-05 | global batch size: 64 | lm loss: 6.407461E+00 | loss scale: 4096.0 | grad norm: 56276.948 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42585 |
+
time (ms)
|
42586 |
+
iteration 5454/ 159576 | consumed samples: 144304 | elapsed time per iteration (ms): 16533.3 | learning rate: 3.994E-05 | global batch size: 64 | lm loss: 6.319425E+00 | loss scale: 4096.0 | grad norm: 66856.562 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42587 |
+
time (ms)
|
42588 |
+
iteration 5455/ 159576 | consumed samples: 144368 | elapsed time per iteration (ms): 16417.1 | learning rate: 3.996E-05 | global batch size: 64 | lm loss: 6.377168E+00 | loss scale: 4096.0 | grad norm: 53863.935 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42589 |
+
time (ms)
|
42590 |
+
iteration 5456/ 159576 | consumed samples: 144432 | elapsed time per iteration (ms): 16422.1 | learning rate: 3.998E-05 | global batch size: 64 | lm loss: 6.368913E+00 | loss scale: 4096.0 | grad norm: 63261.354 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42591 |
+
time (ms)
|
42592 |
+
iteration 5457/ 159576 | consumed samples: 144496 | elapsed time per iteration (ms): 16738.2 | learning rate: 3.999E-05 | global batch size: 64 | lm loss: 6.264383E+00 | loss scale: 4096.0 | grad norm: 64656.043 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42593 |
+
time (ms)
|
42594 |
+
iteration 5458/ 159576 | consumed samples: 144560 | elapsed time per iteration (ms): 16315.9 | learning rate: 4.001E-05 | global batch size: 64 | lm loss: 6.410008E+00 | loss scale: 4096.0 | grad norm: 82472.599 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42595 |
+
time (ms)
|
42596 |
+
iteration 5459/ 159576 | consumed samples: 144624 | elapsed time per iteration (ms): 16385.7 | learning rate: 4.003E-05 | global batch size: 64 | lm loss: 6.419100E+00 | loss scale: 4096.0 | grad norm: 81581.674 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42597 |
+
time (ms)
|
42598 |
+
iteration 5460/ 159576 | consumed samples: 144688 | elapsed time per iteration (ms): 16422.6 | learning rate: 4.005E-05 | global batch size: 64 | lm loss: 6.374327E+00 | loss scale: 4096.0 | grad norm: 77883.993 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42599 |
+
time (ms)
|
42600 |
+
iteration 5461/ 159576 | consumed samples: 144752 | elapsed time per iteration (ms): 16514.0 | learning rate: 4.007E-05 | global batch size: 64 | lm loss: 6.323710E+00 | loss scale: 4096.0 | grad norm: 59535.385 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42601 |
+
time (ms)
|
42602 |
+
iteration 5462/ 159576 | consumed samples: 144816 | elapsed time per iteration (ms): 16520.4 | learning rate: 4.008E-05 | global batch size: 64 | lm loss: 6.325150E+00 | loss scale: 4096.0 | grad norm: 54807.099 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42603 |
+
time (ms)
|
42604 |
+
iteration 5463/ 159576 | consumed samples: 144880 | elapsed time per iteration (ms): 16362.9 | learning rate: 4.010E-05 | global batch size: 64 | lm loss: 6.461391E+00 | loss scale: 4096.0 | grad norm: 74839.084 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42605 |
+
time (ms)
|
42606 |
+
iteration 5464/ 159576 | consumed samples: 144944 | elapsed time per iteration (ms): 16408.3 | learning rate: 4.012E-05 | global batch size: 64 | lm loss: 6.392217E+00 | loss scale: 4096.0 | grad norm: 61727.667 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42607 |
+
time (ms)
|
42608 |
+
iteration 5465/ 159576 | consumed samples: 145008 | elapsed time per iteration (ms): 16556.8 | learning rate: 4.014E-05 | global batch size: 64 | lm loss: 6.349445E+00 | loss scale: 4096.0 | grad norm: 90938.249 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42609 |
+
time (ms)
|
42610 |
+
iteration 5466/ 159576 | consumed samples: 145072 | elapsed time per iteration (ms): 16389.1 | learning rate: 4.015E-05 | global batch size: 64 | lm loss: 6.314983E+00 | loss scale: 4096.0 | grad norm: 62408.172 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42611 |
+
time (ms)
|
42612 |
+
iteration 5467/ 159576 | consumed samples: 145136 | elapsed time per iteration (ms): 16364.1 | learning rate: 4.017E-05 | global batch size: 64 | lm loss: 6.412921E+00 | loss scale: 4096.0 | grad norm: 82535.193 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42613 |
+
time (ms)
|
42614 |
+
iteration 5468/ 159576 | consumed samples: 145200 | elapsed time per iteration (ms): 16712.9 | learning rate: 4.019E-05 | global batch size: 64 | lm loss: 6.508467E+00 | loss scale: 4096.0 | grad norm: 53388.956 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42615 |
+
time (ms)
|
42616 |
+
iteration 5469/ 159576 | consumed samples: 145264 | elapsed time per iteration (ms): 16357.7 | learning rate: 4.021E-05 | global batch size: 64 | lm loss: 6.367021E+00 | loss scale: 4096.0 | grad norm: 88053.691 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42617 |
+
time (ms)
|
42618 |
+
iteration 5470/ 159576 | consumed samples: 145328 | elapsed time per iteration (ms): 16424.7 | learning rate: 4.022E-05 | global batch size: 64 | lm loss: 6.396588E+00 | loss scale: 4096.0 | grad norm: 83281.076 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42619 |
+
time (ms)
|
42620 |
+
iteration 5471/ 159576 | consumed samples: 145392 | elapsed time per iteration (ms): 16363.6 | learning rate: 4.024E-05 | global batch size: 64 | lm loss: 6.387273E+00 | loss scale: 4096.0 | grad norm: 56875.433 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42621 |
+
time (ms)
|
42622 |
+
iteration 5472/ 159576 | consumed samples: 145456 | elapsed time per iteration (ms): 16523.2 | learning rate: 4.026E-05 | global batch size: 64 | lm loss: 6.456463E+00 | loss scale: 4096.0 | grad norm: 60270.862 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42623 |
+
time (ms)
|
42624 |
+
iteration 5473/ 159576 | consumed samples: 145520 | elapsed time per iteration (ms): 16398.7 | learning rate: 4.028E-05 | global batch size: 64 | lm loss: 6.460003E+00 | loss scale: 4096.0 | grad norm: 61151.257 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42625 |
+
time (ms)
|
42626 |
+
iteration 5474/ 159576 | consumed samples: 145584 | elapsed time per iteration (ms): 16345.5 | learning rate: 4.030E-05 | global batch size: 64 | lm loss: 6.443559E+00 | loss scale: 4096.0 | grad norm: 83130.420 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42627 |
+
time (ms)
|
42628 |
+
iteration 5475/ 159576 | consumed samples: 145648 | elapsed time per iteration (ms): 16591.9 | learning rate: 4.031E-05 | global batch size: 64 | lm loss: 6.454519E+00 | loss scale: 4096.0 | grad norm: 69198.394 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42629 |
+
time (ms)
|
42630 |
+
iteration 5476/ 159576 | consumed samples: 145712 | elapsed time per iteration (ms): 16643.0 | learning rate: 4.033E-05 | global batch size: 64 | lm loss: 6.424469E+00 | loss scale: 4096.0 | grad norm: 57626.523 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42631 |
+
time (ms)
|
42632 |
+
iteration 5477/ 159576 | consumed samples: 145776 | elapsed time per iteration (ms): 16362.1 | learning rate: 4.035E-05 | global batch size: 64 | lm loss: 6.285227E+00 | loss scale: 4096.0 | grad norm: 87864.793 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42633 |
+
time (ms)
|
42634 |
+
iteration 5478/ 159576 | consumed samples: 145840 | elapsed time per iteration (ms): 16435.9 | learning rate: 4.037E-05 | global batch size: 64 | lm loss: 6.372074E+00 | loss scale: 4096.0 | grad norm: 67542.980 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42635 |
+
time (ms)
|
42636 |
+
iteration 5479/ 159576 | consumed samples: 145904 | elapsed time per iteration (ms): 16597.3 | learning rate: 4.038E-05 | global batch size: 64 | lm loss: 6.438199E+00 | loss scale: 4096.0 | grad norm: 74119.106 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42637 |
+
time (ms)
|
42638 |
+
iteration 5480/ 159576 | consumed samples: 145968 | elapsed time per iteration (ms): 16483.8 | learning rate: 4.040E-05 | global batch size: 64 | lm loss: 6.487626E+00 | loss scale: 4096.0 | grad norm: 68136.689 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
42639 |
+
time (ms)
|
42640 |
+
[2021-09-25 00:07:47] PULSE: tr8-104B is waiting for the previous job to finish before scheduling a new one using the dependency mechanism (1165978_[1-10%1] on 'gpu_p13' partition)
|
42641 |
+
[2021-09-25 00:07:47] PULSE: tr8-104B is running for 18:15:36 since 2021-09-24T05:52:11 (1162855_1 on 'gpu_p13' partition (r6i4n[5,7],r6i5n[2,7-8],r6i6n[0,2,6],r7i2n[4-5],r7i6n[2-4],r7i7n[7-8],r8i0n[2-3,5-8],r8i1n[0,2-4],r8i2n8,r8i3n[0-2],r8i5n[3-4],r8i7n[3-8],r9i0n[0-2],r9i1n[0-3],r9i2n[3-5,8],r9i3n[0-1,7-8],r9i4n[0-2],r9i5n[3-8],r9i6n[0,7-8])
|