AINovice2005 committed
Commit
1bee25d
1 Parent(s): 6ae5393

Update README.md

Files changed (1): README.md (+3 -3)
README.md CHANGED
@@ -12,7 +12,7 @@ tags:
 
 ---
 
-<h1 style="font-size: 2em;">Presenting ElEmperador.</h1>
+<h1 style="font-size: 2em;">ElEmperador.</h1>
 
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64e8ea3892d9db9a93580fe3/gkDcpIxRCjBlmknN_jzWN.png)
@@ -29,11 +29,11 @@ The argilla/ultrafeedback-binarized-preferences-cleaned dataset was used, albeit
 
 
 # Evals:
-BLEU: 0.0209
+BLEU: 0.209
 
 # Conclusion and Model Recipe.
-ORPO is a viable RLHF algorithm to improve the performance of your models than SFT finetuning. It also helps in aligning the model’s outputs more closely with human preferences,
 
+ORPO is a viable RLHF algorithm that can improve model performance beyond SFT fine-tuning alone, and it helps align the model’s outputs more closely with human preferences,
 leading to more user-friendly and acceptable results.
 
 The model recipe: [https://github.com/ParagEkbote/El-Emperador_ModelRecipe](https://github.com/ParagEkbote/El-Emperador_ModelRecipe)
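The conclusion above leans on ORPO's core idea: add an odds-ratio preference penalty on top of the usual SFT loss, so a single training pass both fits the chosen responses and pushes their odds above the rejected ones. A minimal pure-Python sketch of that objective (the function names and the λ = 0.1 weight are illustrative assumptions, not taken from the model recipe):

```python
import math

def odds(p):
    # Odds of a response probability p in (0, 1): p / (1 - p).
    return p / (1.0 - p)

def orpo_penalty(p_chosen, p_rejected):
    # Odds-ratio term: -log sigmoid(log(odds(chosen) / odds(rejected))).
    # Small when the chosen response is much more likely than the rejected one.
    log_odds_ratio = math.log(odds(p_chosen)) - math.log(odds(p_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_odds_ratio)))

def orpo_loss(nll_chosen, p_chosen, p_rejected, lam=0.1):
    # Total loss = SFT negative log-likelihood on the chosen response
    # plus a lambda-weighted odds-ratio penalty (lam is a hypothetical value).
    return nll_chosen + lam * orpo_penalty(p_chosen, p_rejected)

# When the model prefers neither response, the penalty is -log(0.5) ≈ 0.693;
# it shrinks as the chosen response becomes more likely than the rejected one.
print(orpo_penalty(0.5, 0.5), orpo_penalty(0.9, 0.1))
```

In practice, libraries such as TRL provide an `ORPOTrainer` that computes these quantities from per-token log-probabilities; the sketch above only illustrates the shape of the loss.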