Update README.md
README.md
CHANGED
@@ -19,6 +19,7 @@ The Skywork preference dataset demonstrates that a small high-quality dataset ca
 ## Evaluation
 We evaluate Gemma-2B-rewardmodel-ft on the [reward model benchmark](https://huggingface.co/spaces/allenai/reward-bench), where it achieves a score of 80.5.
 
+**When evaluated using reward bench, please add '--not_quantized' to avoid performance drop.**
 
 | Model | Average | Chat | Chat Hard | Safety | Reasoning |
 |:-------------------------:|:-------------:|:---------:|:---------:|:--------:|:-----------:|