Attempts to fill out the 1B3 model details that diverge from the main model.
#1 by meg (HF staff) - opened
Here I am using https://github.com/bigscience-workshop/bigscience/blob/master/train/tr11-176B-ml/smaller_models/tr11b-1B3-ml.slurm to help flesh out these details.
meg changed pull request status to merged