language:

  • sl

license: cc-by-sa-4.0

T5-slo-word-order-corrector

This T5 model, published as cjvt/t5-slo-word-order-corrector, corrects the word order within sentence sections. A sentence is split into sections at commas and conjunctions.
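The splitting rule described above can be illustrated with a small sketch. This is not the authors' preprocessing code: the conjunction list `CONJUNCTIONS` and the helper `split_into_sections` are hypothetical, and the exact rule used to prepare the model's data is not documented here. The sketch only assumes that a section ends at a comma or just before a standalone conjunction.

```python
import re
from typing import List

# Hypothetical, deliberately incomplete list of Slovenian conjunctions.
# The real list used by the authors is not given in this card.
CONJUNCTIONS = ("in", "ter", "ali", "ampak", "ker")

# Split at a comma (dropping it), or at the whitespace that precedes
# a standalone conjunction (keeping the conjunction in the next section).
_SPLIT_PATTERN = re.compile(
    r",\s*|\s+(?=(?:" + "|".join(CONJUNCTIONS) + r")\b)",
    re.IGNORECASE,
)


def split_into_sections(sentence: str) -> List[str]:
    """Return the sentence sections under the assumed splitting rule."""
    parts = _SPLIT_PATTERN.split(sentence.strip())
    return [part.strip() for part in parts if part.strip()]


print(split_into_sections(
    "Popravi model besedilo, v katerem vrstni je red nekaterih besed napačen."
))
# ['Popravi model besedilo', 'v katerem vrstni je red nekaterih besed napačen.']
```

The comma is discarded and a conjunction stays at the start of the section it introduces; both choices are assumptions made for the illustration.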

Model Output Example

Imagine we have the following Slovenian text:

Popravi model besedilo, v katerem vrstni je red nekaterih besed napačen.

The model might return the following text (note: this prediction was chosen for illustration and may not be reproduced exactly):

Model popravi besedilo, v katerem je vrstni red nekaterih besed napačen.

In the input sentence, both sections, Popravi model besedilo and v katerem vrstni je red nekaterih besed napačen, have incorrect word order. The model corrects them by reordering Popravi model to Model popravi and vrstni je to je vrstni.
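A correction like the one above could be requested with the Hugging Face `transformers` library roughly as follows. This is a minimal sketch under stated assumptions: the checkpoint is loaded by the Hub identifier `cjvt/t5-slo-word-order-corrector`, the whole sentence is passed as input without any task prefix, and default greedy decoding is used. None of these choices is confirmed by the authors, and the helper name `correct_word_order` is hypothetical.

```python
MODEL_ID = "cjvt/t5-slo-word-order-corrector"


def correct_word_order(text: str, model_id: str = MODEL_ID,
                       max_new_tokens: int = 128) -> str:
    """Run the seq2seq model on `text` and return the decoded prediction.

    Assumption: the model accepts a raw sentence with no task prefix.
    """
    # Imported here so that defining the function does not require the
    # heavy dependencies (transformers, torch) to be installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `correct_word_order("Popravi model besedilo, v katerem vrstni je red nekaterih besed napačen.")` downloads the checkpoint on first use. The returned text may differ from the illustrative output shown above.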

More details

Evaluation on generated test sets gives the following results, which combine detection and correction of misordered words:

  • Precision: 0.937
  • Recall: 0.869
  • F1: 0.902
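As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall. The sketch assumes F1 is the usual harmonic mean of the two aggregate figures; how the authors aggregated scores across test sets is not stated, and the helper name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported for the model above.
f1 = f1_score(0.937, 0.869)
print(round(f1, 3))  # 0.902
```

The recomputed value agrees with the reported F1 to three decimal places.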

Acknowledgement

The authors acknowledge the financial support from the Slovenian Research and Innovation Agency - research core funding No. P6-0411: Language Resources and Technologies for Slovene and research project No. J7-3159: Empirical foundations for digitally-supported development of writing skills.

Authors

Thanks to Martin Božič, Marko Robnik-Šikonja and Špela Arhar Holdt for developing these models.
