FERNET-C5

FERNET-C5 (Flexible Embedding Representation NETwork) is a monolingual Czech BERT-base model pre-trained on 93 GB of the Czech Colossal Clean Crawled Corpus (C5). See our paper for details.
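
Because the model is a BERT-style masked language model, one way to try it is masked-token prediction. The following is a minimal sketch, not an official usage recipe. It assumes the Hugging Face `transformers` library is installed and that the checkpoint is published on the Hub as `fav-kky/FERNET-C5`; that identifier is an assumption, not stated in this card. The `{mask}` placeholder and both helper functions are hypothetical names introduced for illustration.

```python
from typing import List, Tuple

# Assumed Hugging Face Hub identifier; this card does not state it.
MODEL_ID = "fav-kky/FERNET-C5"


def insert_mask(template: str, mask_token: str) -> str:
    """Replace the hypothetical "{mask}" placeholder with the given mask token."""
    return template.replace("{mask}", mask_token)


def top_predictions(
    template: str, model_id: str = MODEL_ID, top_k: int = 5
) -> List[Tuple[str, float]]:
    """Return (token, score) pairs predicted for the masked position."""
    # Imported lazily so insert_mask stays usable without transformers installed.
    from transformers import pipeline

    # Downloads the weights on first use.
    fill = pipeline("fill-mask", model=model_id)
    # Read the mask token from the tokenizer instead of hard-coding it.
    sentence = insert_mask(template, fill.tokenizer.mask_token)
    return [(r["token_str"], r["score"]) for r in fill(sentence, top_k=top_k)]
```

Calling `top_predictions("Praha je hlavní {mask} České republiky.")` would return the five highest-scoring candidates for the masked word. The mask token is taken from the tokenizer so the sketch does not depend on a particular vocabulary.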

Paper

https://link.springer.com/chapter/10.1007/978-3-030-89579-2_3

The preprint of our paper is available at https://arxiv.org/abs/2107.10042.

Citation

If you find this model useful, please cite our paper:

@inproceedings{FERNETC5,
    title        = {Comparison of Czech Transformers on Text Classification Tasks},
    author       = {Lehe{\v{c}}ka, Jan and {\v{S}}vec, Jan},
    year         = 2021,
    booktitle    = {Statistical Language and Speech Processing},
    publisher    = {Springer International Publishing},
    address      = {Cham},
    pages        = {27--37},
    doi          = {10.1007/978-3-030-89579-2_3},
    isbn         = {978-3-030-89579-2},
    editor       = {Espinosa-Anke, Luis and Mart{\'i}n-Vide, Carlos and Spasi{\'{c}}, Irena}
}

Model size: 164M params (Safetensors; tensor types: I64, F32)