Transformer language model for Croatian and Serbian

Trained for two epochs on 3 GB of Croatian and Serbian text drawn from the Leipzig and OSCAR datasets.

Dataset information

| Model | #params | Arch. | Training data |
|---|---|---|---|
| Andrija/SRoBERTa-base | 80M | Second | Leipzig Corpus and OSCAR (3 GB of text) |
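The card does not show how to query the model, so here is a minimal sketch. It assumes the checkpoint is a RoBERTa-style masked language model (as the name suggests) that loads through the `transformers` fill-mask pipeline. The helper names `mask_word` and `top_predictions` and the example sentence are illustrative and not part of the original card.

```python
MODEL_ID = "Andrija/SRoBERTa-base"  # model name as listed in this card


def mask_word(sentence: str, word: str, mask_token: str = "<mask>") -> str:
    """Replace the first occurrence of `word` with the mask token."""
    if word not in sentence:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return sentence.replace(word, mask_token, 1)


def top_predictions(sentence: str, word: str, top_k: int = 3):
    """Mask `word` and return the model's top_k (token, score) guesses.

    Assumption: the checkpoint works with the fill-mask pipeline.
    The weights are downloaded from the Hugging Face Hub on first use.
    """
    # Imported lazily so mask_word stays usable without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=MODEL_ID)
    # Read the mask token from the tokenizer rather than assuming "<mask>".
    masked = mask_word(sentence, word, fill.tokenizer.mask_token)
    return [(p["token_str"], p["score"]) for p in fill(masked, top_k=top_k)]
```

Calling `top_predictions("Zagreb je glavni grad Hrvatske.", "grad")` should return up to three candidate tokens with their scores; the actual predictions depend on the checkpoint and are not shown here.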
