Show HN: OCR pipeline for ML training (tables, diagrams, math, multilingual)


Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams) - ses4255/Versatile-OCR-Program

This OCR system is specifically designed to extract structured data from complex educational materials—such as exam papers—in a format optimized for machine learning (ML) training. This includes automatic generation of natural language descriptions for visual content (e.g., “This figure shows the process of mitosis in four stages”) to enhance downstream model training. Below are actual examples of outputs generated by this system using real-world materials (2017 EJU Biology & 2014 University of Tokyo Math), including English-translated semantic context and extracted data.

Related news:

Here's how Trump calculated tariffs. The math has baffled economists

Show HN: Mermaid Chart VS Code Plugin: Mermaid.js Diagrams in Visual Studio Code

Accessible open textbooks in math-heavy disciplines