bhavinjawade committed
Commit 65efbe3 • 1 Parent(s): f71d0bd
Update README.md
README.md CHANGED
@@ -7,7 +7,8 @@ datasets:
 ## SOLAR-10B-OrcaDPO-Jawade
 
 ### Overview
-This model card is instruction finetuned version of `upstage/SOLAR-10.7B-Instruct-v1.0` model. Trained on the Intel DPO Orca dataset using LoRA.
+This model card is instruction finetuned version of `upstage/SOLAR-10.7B-Instruct-v1.0` model. Trained on the Intel DPO Orca dataset using LoRA. Though it should be noted SOLAR-10.7B paper states that the
+original model for alignment was trained on Intel ORCA DPO pairs. Retraining using DPO and LoRA shows slight (<1%) improvement on OpenLLM Leaderboard benchmarks against `SOLAR 10.7B-Instruct` and significant over `SOLAR 10.7B`
 
 ## How to Use This Model
 