
Checkpoint at 31,404,851,200 training tokens. Settings used:

batch_size: 512
context_length: 1024
learning_rate: 2e-4
schedule: cosine with 10% warmup from 0 to 2e-4, cooldown to 0
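The schedule above can be sketched directly: linear warmup from 0 to the peak over the first 10% of steps, then cosine decay back to 0. With batch_size 512 and context_length 1024, each step consumes 512 × 1024 = 524,288 tokens, so 31,404,851,200 tokens corresponds to 59,900 optimizer steps (assuming no gradient accumulation). The helper function and step convention are illustrative, not the training code actually used:

```python
import math

def lr_at(step, total_steps=59_900, peak_lr=2e-4, warmup_frac=0.10):
    """Hypothetical helper: learning rate at a given optimizer step
    for a linear-warmup + cosine-decay schedule."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        # linear warmup from 0 up to peak_lr
        return peak_lr * step / warmup_steps
    # cosine decay from peak_lr down to 0
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```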

requirements:

pytorch
transformers
einops
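Given the requirements above, a minimal sketch of loading the checkpoint with the `transformers` Auto classes (the repo contains custom model code, hence `trust_remote_code=True`; the exact Auto class support is an assumption, not confirmed by this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO_ID = "crumb/32M-32GT-SlimPajama"

def load(repo_id=REPO_ID):
    # trust_remote_code=True is needed for repos that ship custom model code
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
    return tokenizer, model
```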
Note: the serverless Inference API does not yet support model repos that contain custom code, so this model must be loaded locally.

Dataset used to train crumb/32M-32GT-SlimPajama: SlimPajama