Get the latest tech news
Show HN: Zerox – Document OCR with GPT-mini
Zero shot pdf OCR with gpt-4o-mini. Contribute to getomni-ai/zerox development by creating an account on GitHub.
Pass in a PDF (URL or file buffer) Turn the PDF into a series of images Pass each image to GPT and ask nicely for Markdown Aggregate the responses and return Markdown ServiceCostAccuracyTable QualityAWS Textract $1.50 / 1,000 pagesLowLowGoogle Document AI[2]$1.50 / 1,000 pagesLowLowAzure Document AI[3]$1.50 / 1,000 pagesMidLowUnstructured (PDF) $10.00 / 1,000 pagesMidMid-----------------------------------------------------------------Zerox (gpt-mini)$ 4.00 / 1,000 pagesHighHighZerox uses graphicsmagick and ghostscript for the pdf => image processing step. But valueable if your documents have a lot of tabular data, or frequently have tables that cross pages.
Or read this on Hacker News