---
tags:
- donut
- image-to-text
- invoices
---

# Overview

This repository contains a version of the Donut model fine-tuned for document understanding on invoices. Donut (Document Understanding Transformer) was introduced in the paper "OCR-free Document Understanding Transformer" by Geewook Kim et al. and first released in the repository https://github.com/clovaai/donut.

The goal of this fine-tuning is to improve Donut's performance on invoice analysis and extraction. The model was trained on a custom dataset of several hundred annotated invoices. The dataset is not included in this repository; details on its availability will be provided later.

# Model Details

Donut is a transformer-based architecture that uses self-attention to understand documents directly from images, without a separate OCR step. Fine-tuning on a custom invoice dataset is intended to improve how accurately the model extracts relevant information from invoices, such as vendor details, billing information, line items, and totals.

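
As a rough illustration of how the extracted fields might be consumed: Donut-style models generate a tagged token sequence (for example `<s_total>15.50</s_total>`) that is then converted to JSON. The sketch below is a simplified, standard-library-only parser for such a sequence. The function name `parse_donut_sequence`, the field names (`vendor`, `line_items`, `total`), and the sample string are hypothetical and are not taken from this model's actual output. In practice, `DonutProcessor.token2json` from Hugging Face Transformers performs this conversion.

```python
import re

# Matches an opening Donut field tag such as <s_total> and captures the key.
_OPEN_TAG = re.compile(r"<s_([\w-]+)>")


def parse_donut_sequence(seq: str) -> dict:
    """Convert a Donut-style tagged string into a nested dict.

    Simplified sketch: it assumes well-formed tags and does not handle
    a field nested inside another field with the same key.
    """
    result = {}
    pos = 0
    while True:
        match = _OPEN_TAG.search(seq, pos)
        if match is None:
            break
        key = match.group(1)
        close_tag = f"</s_{key}>"
        end = seq.find(close_tag, match.end())
        if end == -1:
            # Unclosed tag: skip it and keep scanning.
            pos = match.end()
            continue
        inner = seq[match.end():end].strip()
        if "<s_" in inner:
            # Nested fields; repeated groups are separated by <sep/>.
            items = [parse_donut_sequence(part) for part in inner.split("<sep/>")]
            items = [item for item in items if item]
            result[key] = items[0] if len(items) == 1 else items
        else:
            result[key] = inner
        pos = end + len(close_tag)
    return result


# Hypothetical model output for a two-line invoice (illustrative only).
sample = (
    "<s_vendor>ACME Corp</s_vendor>"
    "<s_line_items>"
    "<s_description>Widget</s_description><s_amount>10.00</s_amount>"
    "<sep/>"
    "<s_description>Gadget</s_description><s_amount>5.50</s_amount>"
    "</s_line_items>"
    "<s_total>15.50</s_total>"
)

invoice = parse_donut_sequence(sample)
print(invoice)
```

All values are returned as strings, so amounts and totals would still need to be converted to numeric types before any arithmetic or validation.
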
[Demo can be found here](https://colab.research.google.com/drive/1zDvSysp24bCk60LR6172Z94eY1mRhKWF#scrollTo=f7RoSOEXUa6i)