[CodeParrot](https://huggingface.co./lvwerra/codeparrot) uses the GPT-2 architecture with a BPE tokenizer trained on Python code from the training split of the data, and a context length of 1024. We released this model as an educational tool for training large language models from scratch on code, with detailed tutorials and descriptions of the training process. It makes use of πŸ€— [`accelerate`](https://huggingface.co./docs/accelerate/index) for distributed training and mixed precision. See this [blog](https://huggingface.co./blog/codeparrot) and [repo](https://github.com/huggingface/transformers/tree/main/examples/research_projects/codeparrot) for more details.

| Model | # parameters |
| --- | --- |
| GPT-2 | 110M |
| GPT-2 | 1.5B |

You can load the model and tokenizer directly from πŸ€— [`transformers`](https://huggingface.co./docs/transformers/index):
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("lvwerra/codeparrot")
model = AutoModelForCausalLM.from_pretrained("lvwerra/codeparrot")
inputs = tokenizer("def hello_world():", return_tensors="pt")
outputs = model(**inputs)
```
Or you can use a `pipeline`:
```python
from transformers import pipeline
pipe = pipeline("text-generation", model="lvwerra/codeparrot")
outputs = pipe("def hello_world():")
```
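
To generate a completion rather than a single forward pass, you can call `model.generate` and decode the result yourself. A minimal sketch, assuming the smaller 110M checkpoint `lvwerra/codeparrot-small` to keep the download manageable (swap in `lvwerra/codeparrot` for the 1.5B model):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed checkpoint name for the 110M variant; use "lvwerra/codeparrot" for 1.5B.
checkpoint = "lvwerra/codeparrot-small"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("def hello_world():", return_tensors="pt")
# Greedy decoding of up to 32 new tokens; sampling arguments (do_sample, top_k, ...)
# can be passed to generate() for more varied completions.
output_ids = model.generate(**inputs, max_new_tokens=32, do_sample=False)
completion = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(completion)
```

The decoded string includes the original prompt followed by the model's continuation; the `pipeline` call above does the same tokenize-generate-decode loop internally and returns the text under the `generated_text` key of each output dictionary.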