update(weights): train rand init model for 5k steps on cosmopedia 8adff3b verified jon-tow committed on Mar 31
fix(model): update module name for qk norm fields for consistency with phi/persimmon e557170 verified jon-tow committed on Mar 28
fix(model): update module name for qk norm fields for consistency with phi/persimmon c482eda verified jon-tow committed on Mar 28
update(config): use `qk_layernorm` field for consistency with `transformers` eb057ae verified jon-tow committed on Mar 28