Commit 6ae5393 by AINovice2005 (parent: b9bbf9b)

Update README.md

Files changed (1): README.md (+9 -3)

README.md CHANGED
@@ -21,16 +21,22 @@ tags:
 
 ElEmperador is an ORPO-based finetune derived from the Mistral-7B-v0.1 base model.
 
-The argilla/ultrafeedback-binarized-preferences-cleaned dataset was used to improve the performance of the model.
+The argilla/ultrafeedback-binarized-preferences-cleaned dataset was used, albeit only a small portion due to GPU constraints.
 
 ## Citation
 
 Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023). QLoRA: Efficient Finetuning of Quantized LLMs. https://arxiv.org/abs/2305.14314
 
 
-## Bleu:0.0209
-
-The model recipe: https://github.com/ParagEkbote/El-Emperador_ModelRecipe
+# Evals:
+BLEU: 0.0209
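For context on the BLEU figure above, here is a minimal self-contained sketch of sentence-level BLEU with uniform 4-gram weights and a brevity penalty. This is an illustration of the metric only, not the evaluation code that produced the 0.0209 score:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU with uniform n-gram weights and a brevity penalty.

    `reference` and `candidate` are token lists. Returns 0.0 when any
    n-gram order has zero matches (standard BLEU without smoothing).
    """
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clipped matches: each candidate n-gram counts at most as often
        # as it appears in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        if matches == 0:
            return 0.0
        log_precisions.append(math.log(matches / total))
    # Brevity penalty discourages overly short candidates.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / max(len(candidate), 1))
    return bp * math.exp(sum(log_precisions) / max_n)
```

A perfect match scores 1.0, so 0.0209 reflects only modest n-gram overlap with the references; for reproducible reporting, a standard library such as sacreBLEU is the usual choice.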
 
+# Conclusion and Model Recipe
+ORPO is a viable RLHF algorithm that can improve model performance over plain SFT finetuning. It also helps align the model's outputs more closely with human preferences, leading to more user-friendly and acceptable results.
+
+The model recipe: https://github.com/ParagEkbote/El-Emperador_ModelRecipe
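The alignment claim above rests on ORPO's odds-ratio loss: an NLL term on the chosen answer plus a penalty that pushes the policy's odds on the chosen answer above its odds on the rejected one. A minimal numeric sketch follows; the function names and the default λ weight are illustrative (not taken from the model recipe), and the inputs are assumed to be length-normalized average token log-probabilities:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(avg_logp):
    """log odds of a sequence, odds(y) = P(y) / (1 - P(y)),
    with P(y) = exp(mean token log-prob) for length invariance."""
    p = math.exp(avg_logp)  # avg_logp < 0, so 0 < p < 1
    return math.log(p) - math.log(1.0 - p)

def orpo_loss(avg_logp_chosen, avg_logp_rejected, lam=0.1):
    """ORPO objective: NLL on the chosen answer plus a relative
    odds-ratio penalty favoring chosen over rejected."""
    ratio = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    l_or = -math.log(sigmoid(ratio))   # odds-ratio penalty term
    l_sft = -avg_logp_chosen           # standard NLL term
    return l_sft + lam * l_or
```

Unlike DPO, no frozen reference model is needed: the penalty compares the policy's own odds on the two answers, which keeps the memory footprint close to plain SFT.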
 
 ## Inference Script: