AINovice2005 committed on
Commit
1eaa66d
1 Parent(s): f56be9a

Update README.md

  1. README.md +2 -2
README.md CHANGED
@@ -23,7 +23,6 @@ tags:
 
 ElEmperador is an ORPO-based finetune derived from the Mistral-7B-v0.1 base model.
 
-The 'ultrafeedback-binarized-preferences-cleaned' dataset was used for training, albeit a small portion was used due to GPU constraints.
 
 ## Evals:
 BLEU:0.209
@@ -62,5 +61,6 @@ if __name__ == "__main__":
 
 ## Results
 
-ORPO is a viable RLHF algorithm to improve the performance of your models along with SFT finetuning. It also helps in aligning the model’s outputs more closely with human preferences,
+Firstly,ORPO is a viable RLHF algorithm to improve the performance of your models along with SFT finetuning.Secondly, it also helps in aligning the model’s outputs more closely with human preferences,
 leading to more user-friendly and acceptable results.
+