Train after merging?
#1
by
adi-kmt
- opened
Other than adding positive prompts, is it necessary to further fine-tune after merging into an MoE?
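For context, "adding a positive prompt" refers to the routing prompts in a mergekit-moe config, roughly like the sketch below. The base model, expert names, and prompts here are placeholders for illustration, not the actual models from this thread:

```yaml
# Hypothetical mergekit-moe config; all model names and prompts are placeholders.
base_model: mistralai/Mistral-7B-v0.1
gate_mode: hidden          # route tokens by hidden-state similarity to the positive prompts
dtype: bfloat16
experts:
  - source_model: some-math-expert     # placeholder
    positive_prompts:
      - "solve this math problem step by step"
  - source_model: some-code-expert     # placeholder
    positive_prompts:
      - "write a Python function"
```

With mergekit installed, `mergekit-moe config.yml ./merged-moe` produces the merged model; whether it then needs further fine-tuning is the question here.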
Fine-tuning can improve the score again if the dataset is good.
https://huggingface.co./cloudyu/Pluto_24B_DPO_200/blob/main/dpo-metrics.jpg
Thanks for sharing. I would like to know: did you do SFT before the DPO stage, and if so, on which dataset? If not, could you tell me which preference (comparison) dataset you used during DPO training? Thanks again for sharing.