Upcycling Experiments (Collection)
Models I pre-trained by initialising SMoE models from dense model weights, following the upcycling process used for Qwen1.5-MoE-A2.7B (or something similar).
6 items • Updated Apr 1