UTRNet: High-Resolution Urdu Text Recognition In Printed Documents
Abstract
In this paper, we propose a novel approach to address the challenges of printed Urdu text recognition using high-resolution, multi-scale semantic feature extraction. Our proposed UTRNet architecture, a hybrid CNN-RNN model, demonstrates state-of-the-art performance on benchmark datasets. Previous works struggle to generalize to the intricacies of the Urdu script and are limited by the lack of sufficient annotated real-world data. To address these limitations, we introduce UTRSet-Real, a large-scale annotated real-world dataset comprising over 11,000 lines, and UTRSet-Synth, a synthetic dataset of 20,000 lines closely resembling real-world data. We have also corrected the ground truth of the existing IIITH dataset, making it a more reliable resource for future research. We also provide UrduDoc, a benchmark dataset for Urdu text line detection in scanned documents. Additionally, we have developed an online tool for end-to-end Urdu OCR from printed documents by integrating UTRNet with a text detection model. Our work not only addresses the current limitations of Urdu OCR but also paves the way for future research in this area and facilitates the continued advancement of Urdu OCR technology. The project page with source code, datasets, annotations, trained models, and online tool is available at abdur75648.github.io/UTRNet.