
Open-source tool helps you convert PDF documents, web pages, etc., into Markdown


A one-stop, open-source, high-quality data extraction tool that supports extraction from PDFs, web pages, and multi-format e-books. - opendatalab/MinerU

We focus on solving symbol conversion issues in scientific literature and hope to contribute to technological development in the era of large models. In non-mainline environments, the diversity of hardware and software configurations, together with compatibility issues in third-party dependencies, means we cannot guarantee that the project will work in every case. Below are some performance test results, for reference, from an environment running Ubuntu 22.04 LTS with an Intel(R) Xeon(R) Platinum 8352V CPU @ 2.10GHz and an NVIDIA GeForce RTX 4090.


