Update README.md
README.md
CHANGED
@@ -119,6 +119,8 @@ This model is a fine-tuned version of [microsoft/phi-2](https://huggingface.co/m
 I think SPIN can be applied not only to an SFT model, but also to a pretrained model.

 Therefore, I applied SPIN to the pretrained model microsoft/phi-2 and obtained a higher score than the original pretrained model. You can check the [open llm leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).

+However, the ultrachat_200k dataset is an alignment dataset intended for SFT models; an alignment dataset designed for pretrained models should be used here instead.
+
 **I think the best paradigm for training a conversational Large Language Model (LLM) is:

 pretrain -> dpo(spin) -> sft -> dpo(spin)**
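The dpo(spin) stages in the paradigm above both optimize the same preference objective: SPIN reuses the DPO loss, pairing the ground-truth response as "chosen" against a response generated by the previous model iterate as "rejected". A minimal sketch of the per-example loss follows; the function name, argument names, and example log-probabilities are illustrative, not part of any library API.

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO/SPIN per-example loss: -log sigmoid(beta * margin).

    In SPIN, "chosen" is the ground-truth response and "rejected"
    is a response sampled from the previous model iterate, which
    plays the role of the frozen reference model.
    """
    # Margin: how much more the policy prefers the chosen response
    # over the rejected one, relative to the reference model.
    margin = (policy_logp_chosen - ref_logp_chosen) - (
        policy_logp_rejected - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# A positive margin (policy shifted toward real data) drives the
# loss below log 2; a zero margin gives exactly log 2.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))  # ≈ 0.598, below log 2 ≈ 0.693
```

Each SPIN iteration regenerates the "rejected" responses with the newly trained model and repeats this optimization, which is why the same dpo(spin) stage can appear both before and after SFT in the proposed pipeline.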