LayoutLMv2 Model Fine-tuned with CIVQA (Tesseract) dataset

This is a fine-tuned version of the LayoutLMv2 model, trained on the Czech Invoice Visual Question Answering (CIVQA) dataset, which contains invoices in the Czech language, as well as on the Data Visualizations via Question Answering ([DVQA](https://paperswithcode.com/dataset/dvqa)) dataset.

This model enables Document Visual Question Answering on Czech invoices while also making use of the existing DVQA dataset.

For the Czech invoices, we focused on 10 entities that are crucial for invoice processing:

- Variable symbol
- Specific symbol
- Constant symbol
- Bank code
- Account number
- Total amount
- Invoice date
- Name of supplier
- DIC
- QR code
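
The entities above can be queried one question at a time. The following is a minimal sketch, not the authors' evaluation code. It assumes the checkpoint loads with the standard `transformers` LayoutLMv2 classes and that Tesseract, `pytesseract`, and `detectron2` are installed. The `model_id` argument, the dictionary keys, and the English question wording are hypothetical; the phrasing used during fine-tuning may differ.

```python
from typing import Dict

# The ten entities from the list above.
# ASSUMPTION: keys and question wording are illustrative only.
ENTITY_QUESTIONS: Dict[str, str] = {
    "variable_symbol": "What is the variable symbol?",
    "specific_symbol": "What is the specific symbol?",
    "constant_symbol": "What is the constant symbol?",
    "bank_code": "What is the bank code?",
    "account_number": "What is the account number?",
    "total_amount": "What is the total amount?",
    "invoice_date": "What is the invoice date?",
    "supplier_name": "What is the name of the supplier?",
    "dic": "What is the DIC?",
    "qr_code": "What is the QR code?",
}


def answer_invoice_questions(image_path: str, model_id: str) -> Dict[str, str]:
    """Ask one question per entity about a single invoice image.

    `model_id` is a placeholder for the Hub ID or local path of this checkpoint.
    """
    # Heavy dependencies are imported lazily, so the question table above
    # can be used without them.
    import torch
    from PIL import Image
    from transformers import LayoutLMv2ForQuestionAnswering, LayoutLMv2Processor

    # By default the processor runs Tesseract OCR on the image.
    processor = LayoutLMv2Processor.from_pretrained(model_id)
    model = LayoutLMv2ForQuestionAnswering.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    answers: Dict[str, str] = {}
    for entity, question in ENTITY_QUESTIONS.items():
        encoding = processor(
            image, question, return_tensors="pt", truncation=True, max_length=512
        )
        with torch.no_grad():
            outputs = model(**encoding)
        # Take the most likely start and end token of the answer span.
        start = int(outputs.start_logits.argmax(-1))
        end = int(outputs.end_logits.argmax(-1))
        if end < start:
            answers[entity] = ""  # no consistent span was predicted
            continue
        span = encoding["input_ids"][0, start : end + 1]
        answers[entity] = processor.tokenizer.decode(
            span, skip_special_tokens=True
        ).strip()
    return answers
```

Calling `answer_invoice_questions("invoice.png", model_id)` would return a dictionary mapping each entity key to the predicted answer text, with an empty string where no valid span was found. Taking the independent argmax of the start and end logits is the simplest decoding choice; a search over the top-k spans may give better answers.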

You can find more information about this model in this paper.