Tesseract vs. GOT-OCR2_0: Which Performs Better for Text Extraction from Images?

#17

by bubbleMilkTea - opened Sep 25

Discussion

bubbleMilkTea

Sep 25 • edited Sep 25

I'm curious how Tesseract and GOT-OCR2_0 compare. Which one performs better at extracting text from images?
My main goal is to convert an image file into general Markdown. Would you recommend (a) using GOT-OCR2_0's plain text OCR to extract the text and then applying markdownify, or (b) using GOT-OCR2_0's formatted text OCR to extract Mathpix Markdown and then converting that to general Markdown? Which approach would be more efficient, and which would you recommend?
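To make option (b) concrete, here is a rough sketch of the conversion step I have in mind. It is only an illustration: the function name `mmd_to_markdown` is hypothetical, and I am assuming the formatted output uses LaTeX-style constructs such as `\title{...}`, `\section*{...}`, `\( ... \)` and `\[ ... \]`, which may not match what the model actually returns. Tables, figures, and other constructs are not handled.

```python
import re


def mmd_to_markdown(mmd: str) -> str:
    """Rewrite a few Mathpix-Markdown-style constructs as general Markdown.

    Hypothetical helper: covers only titles, sections, and math delimiters.
    """
    # Display math: \[ ... \]  ->  $$...$$
    text = re.sub(
        r"\\\[(.+?)\\\]",
        lambda m: "$$" + m.group(1).strip() + "$$",
        mmd,
        flags=re.DOTALL,
    )
    # Inline math: \( ... \)  ->  $...$
    text = re.sub(
        r"\\\((.+?)\\\)",
        lambda m: "$" + m.group(1).strip() + "$",
        text,
        flags=re.DOTALL,
    )
    # \title{...}  ->  "# ..."
    text = re.sub(
        r"\\title\{(.+?)\}",
        lambda m: "# " + m.group(1).strip(),
        text,
        flags=re.DOTALL,
    )
    # \section{...} or \section*{...}  ->  "## ..."
    text = re.sub(
        r"\\section\*?\{(.+?)\}",
        lambda m: "## " + m.group(1).strip(),
        text,
    )
    return text


if __name__ == "__main__":
    # Stand-in for the string returned by the formatted OCR mode (assumed shape).
    sample = "\\title{Demo}\n\nEnergy is \\( E = mc^2 \\).\n\\[ a^2 + b^2 = c^2 \\]"
    print(mmd_to_markdown(sample))
```

The replacements use functions rather than replacement strings so that backslashes inside the math are copied through unchanged.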
