bhavinjawade/SOLAR-10B-OrcaDPO-Jawade

bhavinjawade committed on Jan 9

Commit 65efbe3 • 1 Parent(s): f71d0bd

Update README.md

Files changed (1)

README.md +2 -1

@@ -7,7 +7,8 @@ datasets:
 ## SOLAR-10B-OrcaDPO-Jawade
 ### Overview
-This model card is instruction finetuned version of `upstage/SOLAR-10.7B-Instruct-v1.0` model. Trained on the Intel DPO Orca dataset using LoRA.
+This model card is instruction finetuned version of `upstage/SOLAR-10.7B-Instruct-v1.0` model. Trained on the Intel DPO Orca dataset using LoRA. Though it should be noted SOLAR-10.7B paper states that the
+original model for alignment was trained on Intel ORCA DPO pairs. Retraining using DPO and LoRA shows slight (<1%) improvement on OpenLLM Leaderboard benchmarks against `SOLAR 10.7B-Instruct` and significant over `SOLAR 10.7B`
 ## How to Use This Model