Update README.md
README.md
@@ -30,8 +30,11 @@ We've modified Flax's 'lm1b' example to train on Japanese dataset. You can find
 | Model | Params | Layers | Dim | Heads | PPL | Dataset | Training time |
 |-|-|-|-|-|-|-|-|
 | lm1b-default | 0.05B | 6 | 512 | 8 | 22.67 | lm1b | 0.5 days |
+| transformer-lm-japanese-default | 0.05B | 6 | 512 | 8 | 66.38 | cc100/ja | 0.5 days |
 | transformer-lm-japanese-0.1b | 0.1B | 12 | 768 | 12 | 35.22 | wiki40b/ja | 1.5 days |
 
+![tensor-board](./tensorboard-v1.png)
+
 ## Usage
 
 Here, we explain the procedure to generate text from pretrained weights using a CPU. We used the following instance on GCE for the Python 3.8 environment.
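The context line above mentions generating text from pretrained weights on a CPU. The actual procedure lives in the repository, not in this diff; as a rough illustration only, the decoding loop such checkpoints are typically used with can be sketched in plain Python. Everything here is a stand-in: `toy_logits` replaces the real Flax model, and the vocabulary, names, and token ids are made up for the example.

```python
def toy_logits(tokens):
    """Stand-in for a language model: scores each id in a 5-token vocab.

    The highest score goes to the id that cyclically follows the last
    token, so greedy decoding produces a predictable sequence.
    """
    last = tokens[-1]
    return [1.0 if i == (last + 1) % 5 else 0.0 for i in range(5)]


def greedy_decode(logits_fn, prompt, max_new_tokens, eos_id=None):
    """Append the argmax token at each step until the length cap or EOS."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = logits_fn(tokens)
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return tokens


print(greedy_decode(toy_logits, [0], 4))  # [0, 1, 2, 3, 4]
```

A real run would swap `toy_logits` for a forward pass of the pretrained transformer and decode the resulting ids back to text with the tokenizer; the loop structure stays the same.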