metadata
title: Ocr
emoji: 👀
colorFrom: blue
colorTo: pink
sdk: streamlit
sdk_version: 1.38.0
app_file: app.py
pinned: false

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Libraries Required

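streamlit, torch, transformers, Pillow, easyocr, huggingface_hub, requests

The app also uses re, which ships with the Python standard library and needs no separate installation.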

Setup & Installation

  • Python 3.8+
  • pip (Python package installer)

Features

  • Dual Language OCR: Supports text extraction from images containing both Hindi and English text.
  • Fallback System: Automatically switches to EasyOCR if the GOT model fails.
  • Image Upload: Users can upload image files (JPEG, PNG) for OCR processing.
  • Text Search: Allows users to search for keywords within the extracted text and highlights the matching results.
  • Simple UI: Easy-to-use interface for non-technical users.
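The fallback behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `ocr_with_fallback`, `primary`, and `fallback` are hypothetical names, and the two engines are passed in as plain callables (standing in for the GOT model and EasyOCR) that take an image path and return text. Treating empty output as a failure is also an assumption.

```python
from typing import Callable, Tuple

# Hypothetical engine signature: image path in, extracted text out.
OcrEngine = Callable[[str], str]


def ocr_with_fallback(
    image_path: str, primary: OcrEngine, fallback: OcrEngine
) -> Tuple[str, str]:
    """Return (text, engine_used), trying `primary` first, then `fallback`."""
    try:
        text = primary(image_path)
        # Assumption: empty or whitespace-only output also counts as a failure.
        if text and text.strip():
            return text, "primary"
    except Exception:
        # Any error raised by the primary engine triggers the fallback.
        pass
    return fallback(image_path), "fallback"
```

Returning the engine name alongside the text lets the interface tell the user which OCR path produced the result.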

Evaluation Criteria

Accuracy

  • The application extracts text from both the Hindi and English portions of an image with high accuracy, using GOT-OCR2_0 as the primary model and EasyOCR as the fallback.

Functionality

  • Handles image uploads (PNG, JPEG).
  • Extracts text from images and displays it in plain text format.
  • Allows users to search for keywords within the extracted text, with the matching results highlighted.
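One way the keyword search and highlighting could work is with the `re` module from the libraries list. The sketch below is an assumption about the approach, not the app's confirmed implementation: `highlight_matches` is a hypothetical helper, and wrapping matches in `<mark>` tags is just one possible highlighting choice.

```python
import re


def highlight_matches(text: str, keyword: str) -> str:
    """Wrap every case-insensitive occurrence of `keyword` in <mark> tags."""
    keyword = keyword.strip()
    if not keyword:
        # Nothing to search for, so return the text unchanged.
        return text
    # re.escape treats the keyword literally, so characters such as "." or "("
    # in user input are not interpreted as regex syntax.
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return pattern.sub(lambda m: f"<mark>{m.group(0)}</mark>", text)


print(highlight_matches("Hello world, hello OCR", "hello"))
```

In a Streamlit app, the returned string could be rendered with `st.markdown(result, unsafe_allow_html=True)`. Because that option renders raw HTML, text extracted from untrusted images should be escaped first.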

User Interface

  • A simple, intuitive interface built with Streamlit.
  • Provides clear instructions and feedback (e.g., displaying progress during OCR, error handling).
  • Responsive layout to fit different screen sizes.

Deployment

  • The application can be deployed online via a cloud service like Streamlit Cloud, Heroku, or any other hosting platform that supports Python applications.

Clarity

  • Well-structured and easy-to-follow code.
  • Clear comments and logical flow for each function.
  • User-friendly interface with concise instructions for end-users.

Completeness

  • All functionalities are implemented and operational.
  • The application works as expected for Hindi and English OCR and includes keyword search.
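
Running the App

With the prerequisites and libraries above installed, change into the project directory and start the Streamlit server:

cd got-ocr-app

streamlit run app.py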