---
language:
- en
license: apache-2.0
widget:
- text: The nodes of a computer network may include [MASK].
library_name: transformers
---
# NetBERT 📶
<img align="left" src="illustration.jpg" width="150"/>
<br><br><br>
NetBERT is a [BERT-base](https://huggingface.co./bert-base-cased) model further pre-trained on a large corpus of computer networking text (~23 GB).
<br><br>
## Usage
You can use the raw model for masked language modeling (MLM), but it is mostly intended to be fine-tuned on a downstream task, especially one that uses the whole sentence to make decisions, such as text classification, extractive question answering, or semantic search.
You can use this model directly with a pipeline for [masked language modeling](https://huggingface.co./tasks/fill-mask):
```python
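from transformers import pipeline

# Build a fill-mask pipeline backed by NetBERT (weights are downloaded on first use)
unmasker = pipeline('fill-mask', model='antoinelouis/netbert')
# Returns the most likely replacements for the [MASK] token
unmasker("The nodes of a computer network may include [MASK].")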
```
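The pipeline returns a list of dictionaries, one per candidate token, with keys such as `token_str` and `score`. The following is a minimal sketch of how those results could be filtered and ranked. The helper names `top_tokens` and `run_demo` and the `min_score` threshold are illustrative assumptions, not part of NetBERT or `transformers`.

```python
def top_tokens(predictions, min_score=0.0):
    """Return (token, score) pairs from a fill-mask result, best first.

    `predictions` is the list of dicts produced by a fill-mask pipeline
    for an input with a single [MASK]. `min_score` is a hypothetical
    cut-off used here only for illustration.
    """
    pairs = [
        (p["token_str"].strip(), p["score"])
        for p in predictions
        if p["score"] >= min_score
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def run_demo(top_k=5):
    """Query NetBERT and print the ranked candidates (downloads the model)."""
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model="antoinelouis/netbert")
    predictions = unmasker(
        "The nodes of a computer network may include [MASK].", top_k=top_k
    )
    for token, score in top_tokens(predictions):
        print(f"{token:<15} {score:.4f}")
```

Calling `run_demo()` fetches the model on first use. `top_tokens` works on any fill-mask output, so it can be reused with results you have already computed.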
You can also use this model to [extract the features](https://huggingface.co./tasks/feature-extraction) of a given text:
```python
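from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and the encoder weights from the Hub
tokenizer = AutoTokenizer.from_pretrained('antoinelouis/netbert')
model = AutoModel.from_pretrained('antoinelouis/netbert')
text = "Replace me by any text you'd like."
# Tokenize into PyTorch tensors and run a forward pass
encoded_input = tokenizer(text, return_tensors='pt')
# output.last_hidden_state holds one vector per input token
output = model(**encoded_input)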
```
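The model output contains one vector per token. To get a single vector per text, one common option is to average the token vectors while ignoring padding. The sketch below assumes mean pooling, which this model card does not prescribe. The function names `mean_pool` and `embed` are hypothetical, and embeddings from the raw model may be of limited use for tasks such as semantic search without fine-tuning.

```python
import torch


def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors per sequence, skipping padded positions.

    last_hidden_state: float tensor of shape (batch, seq_len, hidden)
    attention_mask:    tensor of shape (batch, seq_len), 1 for real tokens
    """
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)  # avoid division by zero
    return summed / counts


def embed(texts, model_name="antoinelouis/netbert"):
    """Return one mean-pooled vector per input text (downloads the model)."""
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()  # disable dropout for deterministic features
    encoded = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for feature extraction
        output = model(**encoded)
    return mean_pool(output.last_hidden_state, encoded["attention_mask"])
```

For example, `embed(["What is a subnet mask?", "How does BGP select routes?"])` would return a tensor with one row per sentence.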
## Documentation
Detailed documentation on the pre-trained model, its implementation, and the data can be found on [GitHub](https://github.com/antoiloui/netbert/blob/master/docs/index.md).
## Citation
For attribution in academic contexts, please cite this work as:
```
@mastersthesis{louis2020netbert,
title={NetBERT: A Pre-trained Language Representation Model for Computer Networking},
author={Louis, Antoine},
year={2020},
school={University of Liege}
}
```