MinghaoYang
committed on
Update README.md
README.md
@@ -17,7 +17,7 @@ pipeline_tag: text-classification
 # INF Outcome Reward Model
 ## Introduction
 
-[**INF-ORM-Llama3.1-70B**]
+[**INF-ORM-Llama3.1-70B**] is the outcome reward model roughly built on the [Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) architecture and trained with the dataset [INF-ORM-Preference-Magnitude-80K](https://huggingface.co/datasets/infly/INF-ORM-Preference-Magnitude-80K).
 
 We did the following three things to improve the performance of our model.
 ### Data Pre-processing