license: apache-2.0
datasets:
- EleutherAI/pile
language:
- en
tags:
- tokenizer
A copy of EleutherAI's gpt-neox-20b tokenizer, with three special tokens added to mask personally identifiable information (PII):
|||EMAIL_ADDRESS|||
|||PHONE_NUMBER|||
|||IP_ADDRESS|||
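
The sketch below shows how text might be masked with the three placeholders above before tokenization. The regular expressions and the `mask_pii` helper are assumptions made for illustration only. They are not the detection rules used to build this tokenizer, and they are far too simple for production PII detection.

```python
import re

# Placeholder strings copied from the special-token list above.
EMAIL_TOKEN = "|||EMAIL_ADDRESS|||"
PHONE_TOKEN = "|||PHONE_NUMBER|||"
IP_TOKEN = "|||IP_ADDRESS|||"

# Illustrative patterns (assumptions): a basic email shape, a dotted IPv4
# shape, and a US-style 3-3-4 digit phone number.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def mask_pii(text: str) -> str:
    """Replace matched spans with the corresponding placeholder token."""
    text = EMAIL_RE.sub(EMAIL_TOKEN, text)
    text = IP_RE.sub(IP_TOKEN, text)
    text = PHONE_RE.sub(PHONE_TOKEN, text)
    return text


if __name__ == "__main__":
    print(mask_pii("Mail bob@example.com or call 555-123-4567 from 10.0.0.1."))
```

Because the placeholders are registered as special tokens, a tokenizer loaded from this repository (for example with `transformers.AutoTokenizer.from_pretrained`) would be expected to encode each one as a single token rather than splitting it into pieces.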