Model: RoBERTa Large
Lang: IT
Model description
This is a RoBERTa Large [1] model for the Italian language. It was obtained by taking XLM-RoBERTa-Large [2] (xlm-roberta-large) as a starting point and focusing it on Italian by modifying the embedding layer (as in [3], computing document-level token frequencies over the Wikipedia dataset).
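The embedding-layer reduction mentioned above can be sketched roughly as follows: rank tokens by their document-level frequency in the target-language corpus, keep the top ones, and slice the original embedding matrix accordingly. The array shapes, frequencies, and vocabulary numbers below are toy illustrative assumptions, not the actual procedure used for this model.

```python
import numpy as np

# Toy sketch of vocabulary reduction: keep only the rows of the original
# embedding matrix that correspond to the most frequent tokens in the
# target-language corpus. All numbers below are illustrative.
rng = np.random.default_rng(0)

orig_vocab_size, hidden = 250_002, 16          # toy stand-in for XLM-R's 250k vocab
embeddings = rng.normal(size=(orig_vocab_size, hidden))

# Hypothetical document-level frequency of each token over an Italian corpus
doc_freq = rng.integers(0, 10_000, size=orig_vocab_size)

keep = 50_670                                  # reduced vocabulary size
kept_ids = np.argsort(doc_freq)[::-1][:keep]   # most frequent tokens first
new_embeddings = embeddings[kept_ids]          # reduced embedding matrix

# old-id -> new-id mapping, so the tokenizer can be re-indexed to match
id_map = {int(old): new for new, old in enumerate(kept_ids)}

print(new_embeddings.shape)  # (50670, 16)
```

The rest of the transformer layers are left untouched; only the embedding (and tied output) matrix shrinks, which is where most of the parameter savings over the original XLM-R come from.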
The resulting model has 356M parameters, a vocabulary of 50,670 tokens, and a size of ~1.42 GB.
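As a quick sanity check, the quoted file size is consistent with storing the 356M parameters as 32-bit floats:

```python
params = 356_000_000      # approximate parameter count
bytes_per_param = 4       # float32 weights
size_gb = params * bytes_per_param / 1e9
print(round(size_gb, 2))  # 1.42 (decimal GB), matching the quoted ~1.42 GB
```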
Quick usage
from transformers import RobertaTokenizerFast, RobertaForMaskedLM, pipeline
tokenizer = RobertaTokenizerFast.from_pretrained("osiria/roberta-large-italian")
model = RobertaForMaskedLM.from_pretrained("osiria/roberta-large-italian")
pipe = pipeline("fill-mask", model=model, tokenizer=tokenizer)
pipe("Milano è una <mask> italiana")
[{'score': 0.9284337759017944,
'token': 7786,
'token_str': 'città',
'sequence': 'Milano è una città italiana'},
{'score': 0.03296631574630737,
'token': 26960,
'token_str': 'capitale',
'sequence': 'Milano è una capitale italiana'},
{'score': 0.015821034088730812,
'token': 8043,
'token_str': 'provincia',
'sequence': 'Milano è una provincia italiana'},
{'score': 0.007335659582167864,
'token': 18841,
'token_str': 'regione',
'sequence': 'Milano è una regione italiana'},
{'score': 0.006183209829032421,
'token': 50152,
'token_str': 'cittadina',
'sequence': 'Milano è una cittadina italiana'}]
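The fill-mask pipeline returns a list of candidate dicts, each with a score, the predicted token, and the completed sequence. Picking the top completion is then straightforward; the snippet below hardcodes an abbreviated copy of the output shown above so it runs without downloading the model.

```python
# Abbreviated copy of the fill-mask output shown above
predictions = [
    {"score": 0.9284, "token_str": "città",
     "sequence": "Milano è una città italiana"},
    {"score": 0.0330, "token_str": "capitale",
     "sequence": "Milano è una capitale italiana"},
    {"score": 0.0158, "token_str": "provincia",
     "sequence": "Milano è una provincia italiana"},
]

# Candidates are sorted by score, so max() and predictions[0] agree
best = max(predictions, key=lambda p: p["score"])
print(best["sequence"])  # Milano è una città italiana
```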
References
[1] https://arxiv.org/abs/1907.11692
[2] https://arxiv.org/abs/1911.02116
[3] https://arxiv.org/abs/2010.05609
License
The model is released under the MIT license.