chujiezheng committed on
Commit 3fcaa9f
1 Parent(s): d611e86

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -10,7 +10,7 @@ The extrapolated (ExPO) model based on [`princeton-nlp/Llama-3-Instruct-8B-SimPO
 
  Specifically, we obtain this model by extrapolating **(alpha = 0.3)** from the weights of the SFT and DPO/RLHF checkpoints, achieving superior alignment with human preference.
 
- This model achieves the **40.6%** win rate and **45.8%** LC win rate on **AlpacaEval 2.0**.
+ This extrapolated model achieves the **40.6%** win rate and **45.8%** LC win rate on **AlpacaEval 2.0**, outperforming the original `Llama-3-Instruct-8B-SimPO`'s 40.5% and 44.7%, respectively.
 
  ## Evaluation Results
 
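
The README text in this diff says the model is obtained by extrapolating with alpha = 0.3 from the SFT and DPO/RLHF checkpoints, but it does not spell out the formula. Below is a minimal sketch of that step, assuming the usual ExPO form `theta_expo = theta_aligned + alpha * (theta_aligned - theta_sft)`. The function name `extrapolate_weights` and the toy dict-of-lists checkpoints are hypothetical stand-ins for real model state dicts, not code from the repository.

```python
def extrapolate_weights(sft, aligned, alpha=0.3):
    """Extrapolate past the aligned checkpoint, away from the SFT checkpoint.

    Assumed formula (not stated in the README itself):
        theta_expo = theta_aligned + alpha * (theta_aligned - theta_sft)

    `sft` and `aligned` map parameter names to flat lists of floats.
    """
    if sft.keys() != aligned.keys():
        raise ValueError("checkpoints must have the same parameter names")

    result = {}
    for name in aligned:
        s_vals, a_vals = sft[name], aligned[name]
        if len(s_vals) != len(a_vals):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Move each weight a further alpha-fraction along the SFT -> aligned direction.
        result[name] = [a + alpha * (a - s) for s, a in zip(s_vals, a_vals)]
    return result


# Toy checkpoints with a single two-element parameter (hypothetical values).
toy_sft = {"w": [1.0, 2.0]}
toy_aligned = {"w": [2.0, 2.0]}
toy_expo = extrapolate_weights(toy_sft, toy_aligned, alpha=0.3)
```

With `alpha = 0` the function returns the aligned checkpoint unchanged. Larger values push the weights further along the direction in which alignment training moved them. For a real model, the same per-parameter arithmetic would be applied to tensors rather than lists.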