To improve code-generation and code-completion ability, I want to do continued pre-training on this instruct version of the model. How should I prepare the pre-training data? Is it enough to append the "<|endoftext|>" token to the end of each chunk of code text?
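To make the question concrete, here is a minimal sketch of the approach I mean. The helper names `join_with_eos` and `pack_into_blocks` are hypothetical, and the packing into fixed-size blocks is my own assumption about a typical causal-LM data pipeline, not something confirmed for this model. Whether "<|endoftext|>" is the right separator for this model's tokenizer is exactly what I am unsure about.

```python
from typing import Iterable, List

# Separator string taken from the question; the correct token for a given
# model depends on its tokenizer (assumption, not verified here).
EOS = "<|endoftext|>"


def join_with_eos(code_chunks: Iterable[str], eos: str = EOS) -> str:
    """Append the end-of-text marker to every chunk and concatenate them."""
    return "".join(chunk + eos for chunk in code_chunks)


def pack_into_blocks(token_ids: List[int], block_size: int) -> List[List[int]]:
    """Split one long token stream into fixed-size blocks, dropping the short tail."""
    n_full = len(token_ids) // block_size
    return [token_ids[i * block_size:(i + 1) * block_size] for i in range(n_full)]


# Two toy code documents standing in for real training files.
chunks = ["def add(a, b):\n    return a + b\n", "print('hello')\n"]
corpus = join_with_eos(chunks)
```

In a real run, `corpus` would be tokenized with the model's own tokenizer and the resulting ids passed to something like `pack_into_blocks` before training.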