Tesseract vs. GOT-OCR2_0: Which Performs Better for Text Extraction from Images?

#17
by bubbleMilkTea - opened

I'm curious about the differences between Tesseract and GOT-OCR2_0. Which one extracts text from images more accurately?
My main goal is to convert an image file into general Markdown. Would you recommend (a) using GOT-OCR2_0's plain text OCR to extract the text and then applying markdownify, or (b) using GOT-OCR2_0's formatted text OCR to extract Mathpix Markdown and then converting that to general Markdown? Which approach would be more efficient, and which would you recommend?
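
To make the second route concrete, here is a minimal sketch of the conversion step I have in mind. It assumes the formatted output contains LaTeX-style commands such as `\title{...}`, `\section*{...}`, `\( ... \)` and `\[ ... \]`; that shape is my assumption, not something I have confirmed for every image. The function name `mathpix_to_markdown` is hypothetical, and only Python's standard `re` module is used. As far as I know, markdownify converts HTML to Markdown, so I'm not sure it would add any structure to plain OCR text in the first route.

```python
import re


def mathpix_to_markdown(text: str) -> str:
    """Convert a small, assumed subset of Mathpix-style markup to general Markdown.

    Hypothetical helper: it only handles headings and math delimiters,
    and leaves everything else untouched.
    """
    # Heading commands -> Markdown headings. The optional "*" covers
    # unnumbered variants such as \section*{...}.
    heading_rules = [
        (r"\\title\{([^{}]*)\}", r"# \1"),
        (r"\\section\*?\{([^{}]*)\}", r"## \1"),
        (r"\\subsection\*?\{([^{}]*)\}", r"### \1"),
    ]
    for pattern, replacement in heading_rules:
        text = re.sub(pattern, replacement, text)

    # Inline math: \( ... \) -> $...$ (inner whitespace trimmed so that
    # common Markdown math renderers recognise the delimiters).
    text = re.sub(
        r"\\\((.+?)\\\)",
        lambda m: "$" + m.group(1).strip() + "$",
        text,
    )

    # Display math: \[ ... \] -> $$ ... $$ (may span several lines).
    text = re.sub(r"\\\[(.+?)\\\]", r"$$\1$$", text, flags=re.DOTALL)
    return text


if __name__ == "__main__":
    sample = "\\title{Report}\n\\section*{Intro}\nEnergy is \\( E = mc^2 \\)."
    print(mathpix_to_markdown(sample))
```

Tables, figures, and nested braces are not handled here, so a real converter would need more rules than this.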
