t5like-60M / README.md
---
datasets:
  - hlillemark/c4_t5_corrupted_seqlen256
language:
  - en
metrics:
  - perplexity
---
| Hyperparameter | Value |
| --- | --- |
| Steps | 150k |
| Max length | 256 |
| LR | 1e-4 |
| LR schedule | constant |
| Optimizer | AdamW |
| beta_1, beta_2 | 0.9, 0.95 |
| Final eval loss | 2.245 |
| Final eval perplexity | 9.44 |
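
The two final eval figures appear to be related by the usual definition of perplexity as the exponential of the loss. The sketch below checks that relationship, assuming the reported loss is a mean per-token cross-entropy in nats (the card does not state this); the `perplexity` helper is a hypothetical name used only for illustration.

```python
import math


def perplexity(mean_nll: float) -> float:
    """Convert a mean per-token negative log-likelihood (in nats) to perplexity."""
    return math.exp(mean_nll)


# Final eval loss as reported in this card.
final_eval_loss = 2.245

# Assumption: the loss is averaged per token and uses the natural logarithm.
print(f"{perplexity(final_eval_loss):.2f}")  # prints 9.44
```

Under that assumption the result agrees with the reported final eval perplexity of 9.44.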