metadata
title: Ocr
emoji: π
colorFrom: blue
colorTo: pink
sdk: streamlit
sdk_version: 1.38.0
app_file: app.py
pinned: false
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
Libraries Required
streamlit, torch, transformers, Pillow, easyocr, huggingface_hub, requests, and re (re is part of the Python standard library and needs no separate installation)
Setup & Installation
- Python 3.8+
- pip (Python package installer)
Features
- Dual Language OCR: Supports text extraction from images containing both Hindi and English text.
- Fallback System: Automatically switches to EasyOCR if the GOT model fails.
- Image Upload: Users can upload image files (JPEG, PNG) for OCR processing.
- Text Search: Allows users to search for keywords within the extracted text and highlights the matching results.
- Simple UI: Easy-to-use interface for non-technical users.
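The fallback behaviour listed above can be sketched as a small wrapper that tries one OCR engine and switches to another on failure. This is a minimal sketch, not the app's actual code: `extract_text_with_fallback`, `OcrFn`, and the two stub engines are hypothetical names, and treating an empty result as a failure is an assumption. In the real app the two callables would wrap the GOT-OCR2_0 and EasyOCR calls.

```python
from typing import Callable, Tuple

# Hypothetical type: any function that takes image bytes and returns text.
OcrFn = Callable[[bytes], str]


def extract_text_with_fallback(
    image_bytes: bytes, primary: OcrFn, fallback: OcrFn
) -> Tuple[str, str]:
    """Return (text, engine_label), trying `primary` first, then `fallback`."""
    try:
        text = primary(image_bytes)
        # Assumption: an empty or whitespace-only result counts as a failure.
        if text.strip():
            return text, "primary"
    except Exception:
        # Any error raised by the primary engine triggers the fallback.
        pass
    return fallback(image_bytes), "fallback"


# Stub engines for illustration only; they stand in for the real models.
def failing_engine(image_bytes: bytes) -> str:
    raise RuntimeError("model failed to load")


def stub_engine(image_bytes: bytes) -> str:
    return "नमस्ते Hello"


text, engine = extract_text_with_fallback(b"fake-image", failing_engine, stub_engine)
print(engine, text)
```

Passing the engines in as plain callables keeps the fallback logic independent of either library, so it can be exercised without loading a model.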
Evaluation Criteria
Accuracy
- The application extracts text from both Hindi and English sections of the image with high accuracy, using either GOT-OCR2_0 or EasyOCR.
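One way to put a number on the accuracy described above is the character error rate (CER): the edit distance between the OCR output and a known-correct transcription, divided by the length of the transcription. The sketch below is an illustration and is not taken from the app; `edit_distance` and `character_error_rate` are hypothetical helper names. It compares Unicode code points, so for Hindi text a combining vowel sign counts as its own character.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the reference length (lower is better)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted character out of five.
print(character_error_rate("hello", "hallo"))
```

A CER of 0 means the output matches the reference exactly. Values above 1 are possible when the output is much longer than the reference.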
Functionality
- Handles image uploads (PNG, JPEG).
- Extracts text from images and displays it in plain text format.
- Allows users to search for keywords within the extracted text, with the matching results highlighted.
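The keyword search and highlighting described above could be implemented with the standard `re` module. The sketch below is one possible approach, not the app's confirmed implementation: `highlight_matches` is a hypothetical helper, and wrapping matches in `**` assumes the result is rendered as Markdown (for example with Streamlit's `st.markdown`).

```python
import re


def highlight_matches(text: str, keyword: str) -> str:
    """Wrap every case-insensitive occurrence of `keyword` in Markdown bold."""
    keyword = keyword.strip()
    if not keyword:
        # Nothing to search for: return the text unchanged.
        return text
    # re.escape makes the keyword match literally, even if it contains
    # characters such as "." or "(" that are special in regular expressions.
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return pattern.sub(lambda match: f"**{match.group(0)}**", text)


print(highlight_matches("Hello world, hello OCR", "hello"))
print(highlight_matches("नमस्ते दुनिया", "दुनिया"))
```

The replacement keeps each match's original casing instead of substituting the keyword as typed, so the extracted text is not altered by the search.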
User Interface
- A simple, intuitive interface built with Streamlit.
- Provides clear instructions and feedback (e.g., displaying progress during OCR, error handling).
- Responsive layout to fit different screen sizes.
Deployment
- The application can be deployed online via a cloud service like Streamlit Cloud, Heroku, or any other hosting platform that supports Python applications.
Clarity
- Well-structured and easy-to-follow code.
- Clear comments and logical flow for each function.
- User-friendly interface with concise instructions for end-users.
Completeness
- All functionalities are implemented and operational.
- The application works as expected for Hindi and English OCR and includes keyword search.
Running the App
cd got-ocr-app
streamlit run app.py