branch: global_step95000_universal, are there no weight files for layer0 and layer1?

#1
by litchi - opened

Are there no weights for layer0 and layer1?
I couldn't find the following model files:
1.input_layernorm.bias
1.input_layernorm.weight
1.mlp.dense_4h_to_h.bias
1.mlp.dense_4h_to_h.weight
1.mlp.dense_h_to_4h.bias
1.mlp.dense_h_to_4h.weight
1.post_attention_layernorm.bias
1.post_attention_layernorm.weight
1.self_attention.dense.bias
1.self_attention.dense.weight
1.self_attention.query_key_value.bias
1.self_attention.query_key_value.weight
@Muennighoff @stas

BigScience Workshop org
edited Mar 29, 2023

Not 100% sure, but I think it's because 1 is the tied embeddings, which are not stored under a number but under tied_modules.

20230329-191114.jpg
I see that the model file for layer.10 is about 27GB (the same as layer.11), but tied_modules is only 42GB.
If layer.0 and layer.1 were in there, they would take 27GB x 2 = 54GB, which is more than 42GB.

So I don't think tied_modules includes layer0 and layer1. Is that correct?
@Muennighoff

BigScience Workshop org

I don't think there is a layer.0 (see https://huggingface.co./bigscience/bloom-optimizer-states/tree/main/global_step95000). The numbering is a bit weird because it includes layers that do not have parameters, hence some numbers are missing. Just try loading it and you will see if something is missing. I think it should work.
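
If you want to see which numbers are absent before loading anything, here is a minimal Python sketch. It assumes the entries are named with a leading layer number followed by a dot, as in the listing in the question (for example `1.input_layernorm.bias`). The helper names and the example listing are hypothetical and not taken from the actual checkpoint. A gap in the numbering only tells you that no parameters are stored under that number, not that something is broken.

```python
import re
from collections import defaultdict


def group_by_layer_index(names):
    """Group names like '3.mlp.dense_h_to_4h.weight' by their leading number.

    Names without a numeric prefix (e.g. anything under tied_modules) are skipped.
    """
    groups = defaultdict(list)
    for name in names:
        match = re.match(r"^(\d+)\.(.+)$", name)
        if match:
            groups[int(match.group(1))].append(match.group(2))
    return dict(groups)


def missing_indices(names, start=0):
    """Return the numbers between `start` and the highest number seen that have no entry."""
    present = set(group_by_layer_index(names))
    if not present:
        return []
    return [i for i in range(start, max(present) + 1) if i not in present]


# Hypothetical listing for illustration only; not the real checkpoint contents.
example = [
    "3.input_layernorm.weight",
    "3.mlp.dense_h_to_4h.weight",
    "4.input_layernorm.weight",
    "6.input_layernorm.weight",
    "tied_modules.example.weight",
]

print(sorted(group_by_layer_index(example)))
print(missing_indices(example))
```

To run it against a local copy, you could pass the result of `os.listdir(...)` on the checkpoint folder as `names`, assuming the folder entries follow the same naming pattern.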
