Lil-Bevo is UT Austin's submission to the BabyLM challenge, specifically the strict-small track.
The tokenizer is a Unigram tokenizer trained on the 10M BabyLM tokens plus the MAESTRO dataset, with a vocabulary size of 16k.
A deberta-small-v3 model is first trained on a mixture of MAESTRO and the 10M tokens for 5 epochs.
Training then continues for 50 epochs on the 10M tokens with a sequence length of 128.
Finally, the model is trained for 2 epochs with targeted linguistic masking at a sequence length of 512.
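
The staged schedule above can be summarized as a small data structure. The following is only an illustrative sketch, not the actual training code: the `Stage` class, `STAGES` list, and `targeted_mask` function are hypothetical names, values the README does not state are left as `None`, and the masking function assumes that "targeted linguistic masking" means masking a chosen set of words with higher probability than other tokens, which this README does not define.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    """One training stage; None marks a value the README does not state."""
    name: str
    data: Optional[str]
    epochs: int
    seq_len: Optional[int]


# Epoch counts and sequence lengths are taken from the README text.
STAGES = [
    Stage("mixture pretraining", "MAESTRO + 10M BabyLM tokens", 5, None),
    Stage("continued training", "10M BabyLM tokens", 50, 128),
    Stage("targeted linguistic masking", None, 2, 512),
]


def targeted_mask(tokens, targets, mask_token="[MASK]",
                  target_prob=0.5, base_prob=0.15, rng=None):
    """Mask tokens in `targets` with probability `target_prob` and all
    other tokens with probability `base_prob`.

    ASSUMPTION: this is one plausible reading of "targeted" masking.
    The probabilities and the word-list approach are placeholders.
    """
    rng = rng or random.Random()
    masked = []
    for tok in tokens:
        p = target_prob if tok.lower() in targets else base_prob
        masked.append(mask_token if rng.random() < p else tok)
    return masked


def total_epochs(stages):
    """Sum the epochs across all stages."""
    return sum(stage.epochs for stage in stages)
```

For example, `total_epochs(STAGES)` gives 57, and `targeted_mask(["the", "cat", "never", "sleeps"], {"never"}, target_prob=1.0, base_prob=0.0)` masks only the targeted word, returning `["the", "cat", "[MASK]", "sleeps"]`.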
This README will be updated with more details soon.