chrispreemo committed
Commit 6188e34
1 Parent(s): cb97e31

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -62,13 +62,13 @@ Supervised fine-tuning (SFT) and direct preference optimization (DPO)[3] further
 | Category | # Tokens (1Ms) | % of Total |
 | --- | --- | --- |
 | Chat (e.g. [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k)) | 640 | 45.2 |
-| Alignment (e.g. [orca_dpo](https://huggingface.co/datasets/Intel/orca_dpo_pairs)) | 331 | 23.4 |
-| Math (e.g. Goat[4]) | 300 | 21.2 |
+| Alignment * (e.g. [orca_dpo](https://huggingface.co/datasets/Intel/orca_dpo_pairs)) | 331 | 23.4 |
+| Math * (e.g. Goat[4]) | 300 | 21.2 |
 | Tabular * | 68 | 4.8 |
 | Summarization (e.g. [legal_summarization](https://huggingface.co/datasets/lighteval/legal_summarization)) | 52 | 3.7 |
 | Open-book (e.g. [selfrag](https://huggingface.co/datasets/selfrag/selfrag_train_data)) | 25 | 1.8 |

-(*) = Proprietary
+(*) = Proprietary or includes proprietary data sets

 [3] Rafailov, R., Sharma, A., Mitchell, E., Ermon, S., Manning, C.D. and Finn, C., 2023. Direct preference optimization: Your language model is secretly a reward model. NeurIPS.
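For context on reference [3] in the hunk above: DPO trains the policy directly on (chosen, rejected) preference pairs against a frozen reference model, with no separate reward model. Below is a minimal PyTorch sketch of that objective, assuming the per-completion log-probabilities have already been summed over tokens; the function and argument names are illustrative, not taken from this repo.

```python
import torch.nn.functional as F

def dpo_loss(pi_chosen_logps, pi_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO loss (Rafailov et al., 2023) for a batch of preference pairs.

    Each input is a 1-D tensor with one entry per (prompt, completion)
    pair: the summed log-probability of that completion under the policy
    (pi_*) or the frozen reference model (ref_*). `beta` scales the
    implicit reward.
    """
    # Implicit rewards: beta * log(pi / ref) for each completion.
    chosen_reward = beta * (pi_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (pi_rejected_logps - ref_rejected_logps)
    # Maximize the log-probability that the chosen completion is
    # preferred to the rejected one under the implicit reward margin.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```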