lqtrung1998 committed
Commit 414d3a8 • 1 Parent(s): 7ee426b
Update README.md
README.md
CHANGED
@@ -10,10 +10,10 @@ Repo: https://github.com/lqtrung1998/mwp_ReFT (under [Apache2.0 License](https:/
 We introduce REinforced Fine-tuning (ReFT), a method that enhances the generalizability of learning LLMs for reasoning.
 
 This repository contains:
-- A Supervised Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-
-- A Warmup Supervised Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-
-- A REinforced Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-
-- A Rerank model that can score the fine-tuned model output: [lqtrung1998/galactica-6.7b-
+- A Supervised Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-SFT-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-SFT-GSM8k)
+- A Warmup Supervised Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-SFT-warmup-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-SFT-warmup-GSM8k)
+- A REinforced Fine-tuned model on GSM8k benchmark: [lqtrung1998/galactica-6.7b-ReFT-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-ReFT-GSM8k)
+- A Rerank model that can score the fine-tuned model output: [lqtrung1998/galactica-6.7b-ReFT-Rerank-GSM8k](https://huggingface.co/lqtrung1998/galactica-6.7b-ReFT-Rerank-GSM8k)
 
 Note: Our models are tuned based on Galactica, thus, licenses applicable to Galactica, such as non-commercial CC BY-NC 4.0 license also hold on these models.
 