---
license: apache-2.0
datasets:
- EleutherAI/pile
language:
- en
tags:
- tokenizer
---

A copy of EleutherAI's [gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer, with three special tokens added to mask PII:
- `|||EMAIL_ADDRESS|||`
- `|||PHONE_NUMBER|||`
- `|||IP_ADDRESS|||`
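
The snippet below sketches one way a tokenizer like this could be rebuilt from the base model. It is an assumption, not necessarily how this copy was produced. `PII_TOKENS` and `build_pii_tokenizer` are hypothetical names chosen for illustration, and the sketch relies on the `transformers` library being installed and on network access to the Hub.

```python
# The three PII mask tokens listed above.
PII_TOKENS = ["|||EMAIL_ADDRESS|||", "|||PHONE_NUMBER|||", "|||IP_ADDRESS|||"]


def build_pii_tokenizer(base="EleutherAI/gpt-neox-20b"):
    """Load the base tokenizer and register the PII mask tokens.

    Hypothetical helper: an assumed recipe, not the authors' confirmed method.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base)
    # add_tokens returns the number of tokens that were new to the vocabulary.
    num_added = tokenizer.add_tokens(PII_TOKENS, special_tokens=True)
    return tokenizer, num_added


if __name__ == "__main__":
    tokenizer, num_added = build_pii_tokenizer()
    print(f"added {num_added} tokens")
    # Print the id assigned to each mask token.
    for token in PII_TOKENS:
        print(token, tokenizer.convert_tokens_to_ids(token))
```

Registering the strings as special tokens is meant to keep each mask intact during tokenization instead of letting it be split into smaller pieces.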