proofGPT-v0.1 / tokenizer_config.json
zhangirazerbayev: Modified tokenizer_config.json, now includes <|endoftext|> token (commit 9695b51)
213 Bytes
{
  "add_prefix_space": false,
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "name_or_path": "EleutherAI/gpt-neox-20b",
  "tokenizer_class": "GPTNeoXTokenizer",
  "unk_token": "<|endoftext|>"
}
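
The config above maps the BOS, EOS, and unknown-token roles to the same `<|endoftext|>` string. A minimal sketch of how one might check that with only the Python standard library follows. The config is inlined as a string so the snippet runs on its own, and the helper name `special_tokens` is hypothetical, introduced here for illustration only.

```python
import json

# Inlined copy of the tokenizer_config.json shown above, so the sketch is
# self-contained. In practice the text would be read from the file on disk.
CONFIG_TEXT = """
{
  "add_prefix_space": false,
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "name_or_path": "EleutherAI/gpt-neox-20b",
  "tokenizer_class": "GPTNeoXTokenizer",
  "unk_token": "<|endoftext|>"
}
"""


def special_tokens(config: dict) -> dict:
    """Return only the entries whose key ends in '_token' (hypothetical helper)."""
    return {key: value for key, value in config.items() if key.endswith("_token")}


config = json.loads(CONFIG_TEXT)
tokens = special_tokens(config)

# Collapse the role -> token mapping to the set of distinct token strings.
shared = set(tokens.values())

print(sorted(tokens))   # the special-token roles defined by the config
print(shared)           # the distinct strings those roles map to
```

Because all three roles share one string, any code that relies on telling BOS apart from EOS or from an unknown token cannot do so from the token alone. When the Hugging Face `transformers` library is installed, `AutoTokenizer.from_pretrained` is the usual way to load a tokenizer from a repository or directory containing this file; that path is not exercised here.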