Fine-tuned with the SFT trainer (the base model is a DPO model).