language:
- sl
license: cc-by-sa-4.0
T5-incorrect-word-spelling-corrector
This T5 model is designed to identify and correct words with incorrect spelling in the Slovenian language.
Model Output Example
Consider the following Slovenian text:
Model v besedlu popravi napaake v nepravilno črkovanih besedah.
The model might return the following text (note: predictions chosen for demonstration/explanation, not reproducibility!):
Model v besedilu popravi napake v nepravilno črkovanih besedah.
In the input sentence, the words besedlu and napaake are misspelled, so the model corrects them to besedilu and napake.
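To make the comparison above concrete, the following minimal sketch lists the words that differ between an input sentence and its corrected version. It is not part of the model or its tooling: the helper name `changed_words` is hypothetical, it uses only Python's standard `difflib`, and it assumes simple whitespace tokenization, which may not match how the model itself segments text.

```python
from difflib import SequenceMatcher


def changed_words(source: str, corrected: str) -> list[tuple[str, str]]:
    """Return (original, replacement) pairs for word spans that differ.

    Hypothetical helper for illustration only; words are split on whitespace.
    """
    src_words = source.split()
    out_words = corrected.split()
    matcher = SequenceMatcher(a=src_words, b=out_words, autojunk=False)

    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "equal" spans are unchanged; everything else is a correction
        # (replace), a removed span (delete), or an added span (insert).
        if tag != "equal":
            changes.append((" ".join(src_words[i1:i2]), " ".join(out_words[j1:j2])))
    return changes


if __name__ == "__main__":
    source = "Model v besedlu popravi napaake v nepravilno črkovanih besedah."
    corrected = "Model v besedilu popravi napake v nepravilno črkovanih besedah."
    for before, after in changed_words(source, corrected):
        print(f"{before} -> {after}")
```

Applied to the example sentences, the helper reports the two corrected words and leaves the rest of the sentence untouched.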
More details
Testing the model on generated test sets gives the following results (detection and correction of misspelled words combined):
Precision: 0.986
Recall: 0.935
F1: 0.960
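The card does not say how these scores were computed. Assuming the standard definitions, where F1 is the harmonic mean of precision and recall, the sketch below shows how the three values relate. The function names and the true-positive, false-positive, and false-negative counts in the usage example are hypothetical and are not taken from the authors' evaluation.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from raw counts.

    tp: misspelled words that were detected and corrected properly
    fp: words that were changed although they should not have been
    fn: misspelled words that were missed or corrected wrongly
    (This interpretation of the counts is an assumption for illustration.)
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


if __name__ == "__main__":
    # Consistency check against the precision and recall reported above.
    print(round(f1_score(0.986, 0.935), 3))

    # Hypothetical counts, purely to show the call shape.
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```

Under this assumption, the reported precision and recall on the generated test sets give an F1 that rounds to the reported 0.960.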
Testing the model in combination with cjvt/SloBERTa-slo-word-spelling-annotator on test sets constructed from the Šolar Eval dataset gives the following results (detection and correction of misspelled words combined):
Precision: 0.823
Recall: 0.796
F1: 0.810
Acknowledgement
The authors acknowledge the financial support from the Slovenian Research and Innovation Agency - research core funding No. P6-0411: Language Resources and Technologies for Slovene and research project No. J7-3159: Empirical foundations for digitally-supported development of writing skills.
Authors
Thanks to Martin Božič, Marko Robnik-Šikonja and Špela Arhar Holdt for developing these models.