Commit 6188e34 by chrispreemo
Parent: cb97e31
Update README.md

README.md CHANGED
@@ -62,13 +62,13 @@ Supervised fine-tuning (SFT) and direct preference optimization (DPO)[3] further
 | Category | # Tokens (1Ms) | % of Total |
 | --- | --- | --- |
 | Chat (e.g. [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k)) | 640 | 45.2 |
-| Alignment (e.g. [orca_dpo](https://huggingface.co/datasets/Intel/orca_dpo_pairs)) | 331 | 23.4 |
-| Math (e.g. Goat[4]) | 300 | 21.2 |
+| Alignment * (e.g. [orca_dpo](https://huggingface.co/datasets/Intel/orca_dpo_pairs)) | 331 | 23.4 |
+| Math * (e.g. Goat[4]) | 300 | 21.2 |
 | Tabular * | 68 | 4.8 |
 | Summarization (e.g. [legal_summarization](https://huggingface.co/datasets/lighteval/legal_summarization)) | 52 | 3.7 |
 | Open-book (e.g. [selfrag](https://huggingface.co/datasets/selfrag/selfrag_train_data)) | 25 | 1.8 |
 
-(*) = Proprietary
+(*) = Proprietary or includes proprietary data sets
 
 [3] Rafailov, R., Sharma, A., Mitchell, E., Ermon, S., Manning, C.D. and Finn, C., 2023. Direct preference optimization: Your language model is secretly a reward model. NeurIPS.
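As a quick sanity check on the data-mixture table in this diff (not part of the commit itself), the "% of Total" column can be reproduced from the token counts; the category names below are taken from the table:

```python
# Token counts (in millions) from the SFT/DPO data-mixture table.
tokens = {
    "Chat": 640,
    "Alignment": 331,
    "Math": 300,
    "Tabular": 68,
    "Summarization": 52,
    "Open-book": 25,
}

# Total tokens across all categories, and each category's share of the total,
# rounded to one decimal place as in the table.
total = sum(tokens.values())
shares = {name: round(100 * count / total, 1) for name, count in tokens.items()}
print(total, shares)
```

Running this gives a total of 1416M tokens, and the computed shares match the table's percentages (45.2, 23.4, 21.2, 4.8, 3.7, 1.8).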