weqweasdas committed on
Commit
be26d01
1 Parent(s): db1b9fa

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -87,7 +87,9 @@ We collect the existing preference datasets and use them as a benchmark to evalu
 
  | Model/Test set | HH-RLHF-Helpful | SHP | Helpsteer helpful + correctness | Helpsteer All | MT Bench Human | MT Bench GPT4 | Alpaca Human | Alpaca GPT4| Alpca Human-crossed|
  | -------------- | -------------- | ------- | ------- | ------- | ------- | ------- | ------- |------- | ------- |
- | open assistant | **0.68** | 0.73 | 0.68 | 0.72 |0.77 | 0.87 | 0.63 | 0.78 | 0.59 |
+ | RM-Gemma-2B | 0.68 | 0.73 | 0.68 | 0.72 |0.77 | 0.87 | 0.63 | 0.78 | 0.59 |
+
+
 
 
 