Add training results
README.md CHANGED
@@ -21,8 +21,16 @@ This is a T5v1.1 (small) trained on the concatenation of the Arabic Billion Word
 | learning rate | `1e-2` |
 | dtype         | `jnp.float32` |
 
+## Results
 
-
+|                     |               |
+| :-----------------: | :-----------: |
+| evaluation accuracy | `56.84%`      |
+| evaluation loss     | `2.423`       |
+| training loss       | `2.392`       |
+| training time       | `22h 23m 51s` |
+
+## Note for finetuning
 
 This model was pretrained with dropout turned off, so the default `dropout_rate` in the model config is `0`.
 To finetune the model, dropout should be turned back on, like this:
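(The snippet referenced by "like this:" sits outside this hunk and is not shown in the diff. Below is a minimal sketch of re-enabling dropout with the Flax T5 classes in 🤗 Transformers; the repository id and the `0.1` dropout rate are placeholders, not values taken from this commit.)

```python
from transformers import FlaxT5ForConditionalGeneration, T5Config

# Placeholder repository id -- substitute the actual model id for this README.
model_id = "<this-model-repo>"

# The pretrained config ships with dropout_rate=0, so override it when loading.
# 0.1 is T5's usual default and is used here only as an illustrative value.
config = T5Config.from_pretrained(model_id, dropout_rate=0.1)
model = FlaxT5ForConditionalGeneration.from_pretrained(model_id, config=config)
```

Passing `dropout_rate=0.1` directly to `from_pretrained` has the same effect, since keyword arguments not consumed by the model loader are forwarded to the config.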