license: cc-by-sa-4.0
pipeline_tag: fill-mask

Model Card for Silesian HerBERT Base

Silesian HerBERT Base is a HerBERT Base model that uses a Silesian tokenizer and was fine-tuned on Silesian Wikipedia.

Usage

Example code:

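from transformers import AutoTokenizer, AutoModel

# Load the Silesian tokenizer and the fine-tuned encoder from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("ipipan/silesian-herbert-base")
model = AutoModel.from_pretrained("ipipan/silesian-herbert-base")

# Tokenize a batch of sentences; each batch item is a plain string.
inputs = tokenizer(
    [
        "Wielgŏ Piyramida we Gizie, mianowanŏ tyż Piyramida ôd Cheopsa, to je nojsrogszŏ a nojbarzij znanŏ ze egipskich piyramid we Gizie.",
    ],
    padding="longest",
    add_special_tokens=True,
    return_tensors="pt",
)

# Run the encoder; output.last_hidden_state holds one vector per input token.
output = model(**inputs)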
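
Because the model card declares the fill-mask pipeline tag, the model can presumably also be queried for masked-token predictions. The following is a minimal sketch, not the author's documented usage: `mask_word` and `predict_masked` are hypothetical helper names, and it assumes the published checkpoint includes masked-language-modelling head weights (if it does not, the predictions would not be meaningful).

```python
def mask_word(sentence: str, word: str, mask_token: str) -> str:
    """Replace the first occurrence of `word` with the given mask token."""
    if word not in sentence:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return sentence.replace(word, mask_token, 1)


def predict_masked(sentence: str, word: str, top_k: int = 5):
    """Return (token, score) pairs predicted for the masked word.

    Assumption: the checkpoint ships a masked-language-modelling head.
    """
    # Imported lazily so mask_word stays usable without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="ipipan/silesian-herbert-base")
    # Read the mask token from the tokenizer instead of hard-coding it.
    masked = mask_word(sentence, word, fill_mask.tokenizer.mask_token)
    return [(r["token_str"], r["score"]) for r in fill_mask(masked, top_k=top_k)]
```

Calling `predict_masked` with the example sentence above and a word such as "znanŏ" downloads the model on first use and returns the highest-scoring candidate tokens for the masked position.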

License

CC BY-SA 4.0

Citation

If you use this model, please cite the following paper:


Authors

The model was created by Piotr Rybak from the Linguistic Engineering Group at the Institute of Computer Science, Polish Academy of Sciences.

This work was supported by the European Regional Development Fund as part of the 2014–2020 Smart Growth Operational Programme, CLARIN: Common Language Resources and Technology Infrastructure, project no. POIR.04.02.00-00C002/19.