End of training
e21936e
-
copy_teacher_modules=_(_lm_head___True)_, hs_layer_mapper=last, hs_loss_fn=mse, hs_weight=1.0, learning_rate=0.0001, per_device_train_batch_size=4
End of training
-
copy_teacher_modules=_(_lm_head___True)_, hs_layer_mapper=last, hs_loss_fn=mse, hs_weight=1.0
Training in progress, step 12375
-
dataset_subset=default, dataset_uri=distily_c4_multilingual_1M, learning_rate=0.0001, per_device_train_batch_size=4
Training in progress, step 24750
-
dataset_subset=default, dataset_uri=distily_c4_multilingual_1M
Training in progress, step 24750
-
hs_layer_mapper=last, hs_loss_fn=mse, hs_weight=1.0
End of training
-
learning_rate=0.0001, per_device_train_batch_size=4, reinitialize_weights=xavier
Training in progress, step 24750
-
learning_rate=0.0001, per_device_train_batch_size=4
End of training
-
0 Bytes
End of training
-
5.85 MB
End of training
-
578 Bytes
End of training