Saving weights and log at step 500000

Files changed (4)

README.md CHANGED

@@ -26,7 +26,8 @@ Tokenizer:
 Training details:
-* Training at step 360000 of 2082009 (17%)
+* Training started on step 360K (bs 16) ppl 21 of earlier model trained with Adam optimizer.
+* Training at step 500000 of 2082009 (24%)
 * Block size: 512
 * Optimizer: adafactor
 * Learning rate: 3.3e-5
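
The progress percentages in the README lines above can be checked with a little arithmetic. The following is a minimal sketch, assuming the percentage is the step count divided by the total step count, rounded to a whole number; the function name `progress_percent` is hypothetical and not part of the repository.

```python
# Step counts taken from the README diff; the rounding rule is an assumption.
TOTAL_STEPS = 2_082_009


def progress_percent(step: int, total: int = TOTAL_STEPS) -> int:
    """Return training progress as a whole-number percentage."""
    return round(100 * step / total)


print(progress_percent(360_000))  # 17
print(progress_percent(500_000))  # 24
```

Rounding down instead of rounding to nearest gives the same two values here, so the README figures do not reveal which rule was used.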

flax_model.msgpack ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:5ffe240036734a127d630a18a71cb22682941946b474d8859d22ae8cc17fa140
+size 3096134690

pytorch_model.bin ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:6fd7590b2bbcad77e477fc1690f7238b41a65f97aa34f1e2c071cc677908c763
+size 3134045897

runs/events.out.tfevents.1641055391.t1v-n-f9cfcc28-w-0.112189.0.v2 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:117900743772bb4b91532ebca51190b534c4bcc1e8255614fb29f99b0274371a
-size 60168291
+oid sha256:4edc4a3b7c6574e7fcd9975c6c9105dcf1ccce551270dfcd500e9ed484a445bb
+size 74565917
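
The binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, a SHA-256 object ID, and the size in bytes. As a rough sketch of how a downloaded file could be checked against such a pointer, the helper names `parse_lfs_pointer` and `matches_pointer` below are hypothetical, and the sample text is the `flax_model.msgpack` pointer from this commit.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the flax_model.msgpack entry in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5ffe240036734a127d630a18a71cb22682941946b474d8859d22ae8cc17fa140
size 3096134690
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and digest in `pointer`."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in chunks so multi-gigabyte weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Assuming the weights have been downloaded to a local path, `matches_pointer("flax_model.msgpack", pointer)` would compare the file size first and only hash the file when the sizes agree.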