BramVanroy committed
Update README.md
README.md CHANGED
@@ -53,12 +53,11 @@ Here is a break down of the training set (some data pages might not be available
 - [BramVanroy/dolly-15k-dutch](https://huggingface.co/datasets/BramVanroy/dolly-15k-dutch) (gpt-3.5-turbo; translated): 1.39%
 
 
-
 ## Training procedure
 
 The great [alignment handbook](https://github.com/huggingface/alignment-handbook/) was used for training, with a custom slurm script for compatibility with our cluster. It was trained in full, without LoRA or other adapters.
 
-The model was trained in bfloat16 with flash attention 2 and a context length of 8192.
+The model was trained in bfloat16 with flash attention 2 and a context length of 8192. You can find the [wandb logs](https://wandb.ai/bramvanroy/sft-geitje-ultra) here.
 
 Recipe used with the handbook:
 
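For context, the settings mentioned in the edited line (bfloat16, flash attention 2) map directly onto Hugging Face transformers loading arguments. Below is a minimal sketch of loading the model with those settings; the model ID is an assumption inferred from the wandb project name and is not stated in this commit.

```python
# Minimal sketch: load a model in bfloat16 with flash attention 2,
# mirroring the training settings described in the diff above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# NOTE: assumed model ID, inferred from the wandb project "sft-geitje-ultra".
model_id = "BramVanroy/GEITje-7B-ultra-sft"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # trained in bfloat16
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    device_map="auto",
)

# The model was trained with a context length of 8192 tokens.
prompt = "Schrijf een korte alinea over taalmodellen."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Using `attn_implementation="flash_attention_2"` requires a supported GPU and the `flash-attn` package; dropping the argument falls back to the default attention implementation.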