Saving weights and log at step 500000
README.md
CHANGED
@@ -26,7 +26,8 @@ Tokenizer:
 
 Training details:
 
-* Training
+* Training started on step 360K (bs 16) ppl 21 of earlier model trained with Adam optimizer.
+* Training at step 500000 of 2082009 (24%)
 * Block size: 512
 * Optimizer: adafactor
 * Learning rate: 3.3e-5
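The "24%" in the README line follows from the two step counts. A minimal sketch of that arithmetic, assuming the percentage is simply the current step divided by the total number of steps (the function name `progress_percent` is hypothetical and not part of the training script):

```python
def progress_percent(step: int, total_steps: int) -> float:
    """Return training progress as a percentage of total steps."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return 100.0 * step / total_steps


# Step counts taken from the README line in this commit.
pct = progress_percent(500_000, 2_082_009)
print(f"{pct:.1f}%")  # rounds to the 24% quoted in the README
```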
flax_model.msgpack
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5ffe240036734a127d630a18a71cb22682941946b474d8859d22ae8cc17fa140
+size 3096134690
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6fd7590b2bbcad77e477fc1690f7238b41a65f97aa34f1e2c071cc677908c763
+size 3134045897
runs/events.out.tfevents.1641055391.t1v-n-f9cfcc28-w-0.112189.0.v2
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:4edc4a3b7c6574e7fcd9975c6c9105dcf1ccce551270dfcd500e9ed484a445bb
+size 74565917
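The three binary files in this commit are stored as Git LFS pointers: short text files with `version`, `oid`, and `size` lines. A minimal sketch of reading such a pointer, assuming the key/value layout shown in the hunks above (the helper `parse_lfs_pointer` is hypothetical, not part of Git LFS or the Hub tooling):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer contents copied from the flax_model.msgpack hunk in this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:5ffe240036734a127d630a18a71cb22682941946b474d8859d22ae8cc17fa140
size 3096134690
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["oid"][:12], f"{info['size'] / 2**30:.2f} GiB")
```

The `size` field is the byte count of the actual weights file, so it can be used to check a download before loading the checkpoint.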