---
license: mit
---

# Phi-2 Orange

A two-step finetune of [Phi-2](https://huggingface.co/microsoft/phi-2).

First, a finetune using a collection of broad training data:

- [Open-Orca/SlimOrca-Dedup](https://huggingface.co/datasets/Open-Orca/SlimOrca-Dedup)
- [migtissera/Synthia-v1.3](https://huggingface.co/datasets/migtissera/Synthia-v1.3)
- [LDJnr/Verified-Camel](https://huggingface.co/datasets/LDJnr/Verified-Camel)
- [LDJnr/Pure-Dove](https://huggingface.co/datasets/LDJnr/Pure-Dove)
- [LDJnr/Capybara](https://huggingface.co/datasets/LDJnr/Capybara)
- [meta-math/MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA)
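
As a rough illustration (the card does not include the actual training code), this first stage could be reproduced with TRL's `SFTTrainer`; the dataset conversion, hyperparameters, and output paths below are assumptions, not the authors' recipe:

```python
from datasets import concatenate_datasets, load_dataset
from trl import SFTConfig, SFTTrainer

# The listed datasets use different schemas, so each is flattened to a
# plain "text" column before mixing. SlimOrca-Dedup is ShareGPT-style:
def sharegpt_to_text(example):
    turns = [f"{t['from']}: {t['value']}" for t in example["conversations"]]
    return {"text": "\n".join(turns)}

slim_orca = load_dataset("Open-Orca/SlimOrca-Dedup", split="train")
slim_orca = slim_orca.map(sharegpt_to_text, remove_columns=slim_orca.column_names)

# MetaMathQA is simple query/response pairs:
metamath = load_dataset("meta-math/MetaMathQA", split="train")
metamath = metamath.map(
    lambda ex: {"text": f"user: {ex['query']}\nassistant: {ex['response']}"},
    remove_columns=metamath.column_names,
)

# ...the other four datasets would be converted the same way.
train_dataset = concatenate_datasets([slim_orca, metamath]).shuffle(seed=42)

trainer = SFTTrainer(
    model="microsoft/phi-2",
    train_dataset=train_dataset,
    args=SFTConfig(
        output_dir="phi-2-sft",
        max_seq_length=2048,            # illustrative, not from the card
        per_device_train_batch_size=4,  # illustrative
        num_train_epochs=1,             # illustrative
    ),
)
trainer.train()
trainer.save_model("phi-2-sft")
```

Each dataset keeps its own prompt format in this sketch; a real run would normalize everything to a single chat template first.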

And then a DPO finetune using:

- [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs)
- [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned)
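
The DPO stage could look like the following, again with a recent TRL (`DPOTrainer`); the stage-one checkpoint path, column remapping, and `beta` are illustrative, not the authors' settings:

```python
from datasets import concatenate_datasets, load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# orca_dpo_pairs stores system/question/chosen/rejected strings; map them
# to the prompt/chosen/rejected columns DPOTrainer expects.
orca = load_dataset("Intel/orca_dpo_pairs", split="train")
orca = orca.map(
    lambda ex: {"prompt": ex["question"], "chosen": ex["chosen"], "rejected": ex["rejected"]},
    remove_columns=orca.column_names,
)

# The argilla set stores chosen/rejected as full chat transcripts; keep the
# prompt and the final assistant reply of each transcript.
ultra = load_dataset("argilla/ultrafeedback-binarized-preferences-cleaned", split="train")
ultra = ultra.map(
    lambda ex: {
        "prompt": ex["prompt"],
        "chosen": ex["chosen"][-1]["content"],
        "rejected": ex["rejected"][-1]["content"],
    },
    remove_columns=ultra.column_names,
)

train_dataset = concatenate_datasets([orca, ultra]).shuffle(seed=42)

# Start from the (hypothetical) stage-one checkpoint.
model = AutoModelForCausalLM.from_pretrained("phi-2-sft")
tokenizer = AutoTokenizer.from_pretrained("phi-2-sft")

trainer = DPOTrainer(
    model=model,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # older TRL versions call this `tokenizer`
    args=DPOConfig(
        output_dir="phi-2-orange",
        beta=0.1,  # illustrative DPO temperature, not from the card
    ),
)
trainer.train()
```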

# Initial Evals

- ARC: 62.29
- TruthfulQA: 49.85
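
The card does not state how these scores were measured. If they follow the common Open LLM Leaderboard setup, a sketch with EleutherAI's lm-evaluation-harness would be (the task names, few-shot counts, and model path are assumptions):

```python
import lm_eval

# "arc_challenge" (25-shot) and "truthfulqa_mc2" (0-shot) are the harness
# tasks usually behind "ARC" and "TruthfulQA" headline scores.
for task, shots in [("arc_challenge", 25), ("truthfulqa_mc2", 0)]:
    out = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=phi-2-orange",  # stand-in path to the finetuned model
        tasks=[task],
        num_fewshot=shots,
    )
    print(task, out["results"][task])
```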