---
language:
- en
license: apache-2.0
widget:
- text: The nodes of a computer network may include [MASK].
library_name: transformers
---

# NetBERT 📶

<img align="left" src="illustration.jpg" width="100"/>

NetBERT is a [BERT-base](https://huggingface.co./bert-base-cased) model further pre-trained on a large corpus of computer networking text (~23 GB).

## Usage

You can use the raw model for masked language modeling (MLM), but it is mainly intended to be fine-tuned on a downstream task, especially one that uses the whole sentence to make decisions, such as text classification, extractive question answering, or semantic search.

You can use this model directly with a pipeline for [masked language modeling](https://huggingface.co./tasks/fill-mask):

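```python
from transformers import pipeline

# Load NetBERT into a fill-mask pipeline (the weights are downloaded on first use).
unmasker = pipeline('fill-mask', model='antoinelouis/netbert')

# Returns the most likely replacements for [MASK], each with a score.
unmasker("The nodes of a computer network may include [MASK].")
```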
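
The pipeline call above returns a list of dictionaries, one per candidate token. The sketch below shows one possible way to reduce that output to `(token, score)` pairs. The helper names `summarize_predictions` and `top_fillers`, and the `min_score` threshold, are illustrative assumptions and not part of NetBERT or `transformers`. It also assumes the input contains exactly one `[MASK]`.

```python
def summarize_predictions(predictions, min_score=0.0):
    """Return (token, score) pairs from fill-mask output, best first.

    `predictions` is expected to be a list of dicts with the `token_str`
    and `score` keys produced by a fill-mask pipeline for a single [MASK].
    """
    pairs = [
        (p["token_str"], round(p["score"], 4))
        for p in predictions
        if p["score"] >= min_score
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def top_fillers(text, top_k=10, min_score=0.01):
    """Run NetBERT's fill-mask pipeline on `text` and summarize the result."""
    # Imported lazily so summarize_predictions can be used on its own.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model="antoinelouis/netbert")
    return summarize_predictions(unmasker(text, top_k=top_k), min_score)


# Example call (hypothetical helper; downloads the model on first use):
# top_fillers("The nodes of a computer network may include [MASK].")
```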

You can also use this model to [extract the features](https://huggingface.co./tasks/feature-extraction) of a given text:

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('antoinelouis/netbert')
model = AutoModel.from_pretrained('antoinelouis/netbert')

text = "Replace me by any text you'd like."
# Tokenize the text into PyTorch tensors.
encoded_input = tokenizer(text, return_tensors='pt')
# output.last_hidden_state holds one vector per input token.
output = model(**encoded_input)
```
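
For semantic search, the per-token features usually need to be reduced to one vector per text before texts can be compared. The sketch below assumes mean pooling over non-padding tokens followed by cosine similarity. This is a common choice rather than a method documented for NetBERT, and the raw model may need fine-tuning before such vectors rank well. The function names `embed`, `cosine_similarity`, and `rank_by_similarity` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors given as lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Define similarity with a zero vector as 0.
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


def embed(texts, model_name="antoinelouis/netbert"):
    """Mean-pool token vectors into one vector per text (assumed pooling)."""
    # Imported lazily so the ranking helpers work without torch installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()
    encoded = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**encoded).last_hidden_state  # (batch, tokens, hidden)
    # Zero out padding positions before averaging over the token axis.
    mask = encoded["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
    return pooled.tolist()


# Example with made-up sentences (downloads the model on first use):
# vectors = embed(["What does a router do?", "Routers forward packets.", "Cats sleep a lot."])
# print(rank_by_similarity(vectors[0], vectors[1:]))
```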

## Documentation

Detailed documentation on the pre-trained model, its implementation, and the data can be found on [GitHub](https://github.com/antoiloui/netbert/blob/master/docs/index.md).

## Citation

For attribution in academic contexts, please cite this work as:

```
@mastersthesis{louis2020netbert,
    title={NetBERT: A Pre-trained Language Representation Model for Computer Networking},
    author={Louis, Antoine},
    year={2020},
    school={University of Liege}
}
```