[OPT](https://huggingface.co./facebook/opt-30b) is a family of decoder-only models, similar in architecture to GPT-3. It was trained on datasets containing a small portion of code. This demo uses the 30B-parameter model; the largest model in the family has 175B parameters.
| Model | Architecture | # parameters |
| - | - | - |
| OPT-30B | Decoder-only | 30B |