BramVanroy
committed on
Update README.md
README.md
CHANGED
@@ -69,7 +69,7 @@ The training set (`train_sft`) consists of 240,527,565 tokens (calculated prior
 Here is a break down of the training set (some data pages might not be available yet *but they definitely will be in the near future*).

 - [BramVanroy/ultrachat_200k_dutch](https://huggingface.co/datasets/BramVanroy/ultrachat_200k_dutch) (gpt-4-turbo; multi-turn; generated): 85.42%
-- [BramVanroy/no_robots_dutch](https://huggingface.co/datasets/BramVanroy/no_robots_dutch) (gpt-4-turbo; prompt translate, answer generated): 2.20%
+- [BramVanroy/no_robots_dutch](https://huggingface.co/datasets/BramVanroy/no_robots_dutch) (gpt-4-turbo; prompt translate, answer generated; some items have system messages): 2.20%
 - [BramVanroy/stackoverflow-chat-dutch](https://huggingface.co/datasets/BramVanroy/stackoverflow-chat-dutch) (gpt-3.5-turbo; multi-turn; code; translated; only 50% used): 8.38%
 - [BramVanroy/alpaca-cleaned-dutch](https://huggingface.co/datasets/BramVanroy/alpaca-cleaned-dutch) (gpt-3.5-turbo; translated): 2.62%
 - [BramVanroy/dolly-15k-dutch](https://huggingface.co/datasets/BramVanroy/dolly-15k-dutch) (gpt-3.5-turbo; translated): 1.39%
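To give a feel for the sizes behind the percentages in this hunk, here is a minimal sketch that converts each listed share into an approximate token count. It assumes the percentages are shares of the 240,527,565 tokens quoted in the hunk header, which the README excerpt does not state explicitly; the names `TOTAL_TOKENS`, `SHARES_PERCENT`, and `estimate_tokens` are made up for this illustration and are not part of the dataset or any library.

```python
# Assumption: the listed percentages are shares of the train_sft token count.
TOTAL_TOKENS = 240_527_565  # token count quoted in the hunk header

# Shares copied from the list in the diff (values in percent).
SHARES_PERCENT = {
    "BramVanroy/ultrachat_200k_dutch": 85.42,
    "BramVanroy/no_robots_dutch": 2.20,
    "BramVanroy/stackoverflow-chat-dutch": 8.38,
    "BramVanroy/alpaca-cleaned-dutch": 2.62,
    "BramVanroy/dolly-15k-dutch": 1.39,
}


def estimate_tokens(total, shares_percent):
    """Return an approximate token count per dataset, rounded to whole tokens."""
    return {name: round(total * pct / 100) for name, pct in shares_percent.items()}


estimates = estimate_tokens(TOTAL_TOKENS, SHARES_PERCENT)
share_sum = sum(SHARES_PERCENT.values())

for name, count in estimates.items():
    print(f"{name}: ~{count:,} tokens")
print(f"listed shares sum to {share_sum:.2f}%")
```

The listed shares add up to 100.01% rather than exactly 100%, most likely because each value was rounded to two decimals, so the per-dataset estimates should be read as approximate.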