fukugawa committed

Commit 85e238b
1 Parent(s): 54101a9

Update README.md

Files changed (1):
  1. README.md +3 -0

README.md CHANGED
@@ -30,8 +30,11 @@ We've modified Flax's 'lm1b' example to train on Japanese dataset. You can find
 | Model | Params | Layers | Dim | Heads | PPL | Dataset | Training time |
 |-|-|-|-|-|-|-|-|
 | lm1b-default | 0.05B | 6 | 512 | 8 | 22.67 | lm1b | 0.5 days |
+| transformer-lm-japanese-default | 0.05B | 6 | 512 | 8 | 66.38 | cc100/ja | 0.5 days |
 | transformer-lm-japanese-0.1b | 0.1B | 12 | 768 | 12 | 35.22 | wiki40b/ja | 1.5 days |
 
+![tensor-board](./tensorboard-v1.png)
+
 ## Usage
 
 Here, we explain the procedure to generate text from pretrained weights using a CPU. We used the following instance on GCE for the Python 3.8 environment.
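The "Usage" section the diff touches introduces generating text from pretrained weights on a CPU. As a minimal sketch of the greedy decoding loop such a procedure typically runs (every name below is a hypothetical placeholder, not the repo's actual API: `logits_fn` stands in for a real Flax model apply call on the pretrained parameters):

```python
# Minimal greedy-decoding sketch for an autoregressive LM.
# All names here are hypothetical placeholders, not the actual
# transformer-lm-japanese API: `logits_fn` stands in for a real
# Flax model-apply call on pretrained weights.
from typing import Callable, List


def greedy_decode(
    logits_fn: Callable[[List[int]], List[float]],
    prompt: List[int],
    eos_id: int,
    max_new_tokens: int = 20,
) -> List[int]:
    """Repeatedly append the argmax next token until EOS or the length cap."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = logits_fn(tokens)  # next-token scores over the vocabulary
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


# Toy stand-in "model": always favors token (last_token + 1) mod vocab_size.
def toy_logits(tokens: List[int], vocab_size: int = 5) -> List[float]:
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


print(greedy_decode(toy_logits, prompt=[0], eos_id=4))  # [0, 1, 2, 3, 4]
```

A real run would replace `toy_logits` with the pretrained model's forward pass and decode token IDs back to Japanese text with the project's tokenizer; temperature or top-k sampling would slot into the same loop in place of the argmax.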