Smaug - Japanese Language Support

#11
by msmmpts - opened

Hi team,

I was wondering whether the Smaug model included any Japanese datasets during its training phase. If yes, could you please share the Japanese content on which Smaug was trained?

We did not utilise any Japanese datasets during the training of Smaug, and it does not appear as though the model we started from (https://huggingface.co/moreh/MoMo-72B-lora-1.8.7-DPO) did either.
Once we release our technique paper in a couple of weeks, though, you could try to replicate the process with some Japanese datasets added in :)

Yes, no decent Japanese open-source LLMs exist, so that would be nice.
