Open-source tool helps you convert PDF documents, web pages, etc., into Markdown
A one-stop, open-source, high-quality data extraction tool that supports extraction from PDFs, web pages, and multi-format e-books. - opendatalab/MinerU
We focus on solving symbol conversion issues in scientific literature and hope to contribute to technological development in the era of large models. In non-mainline environments, the diversity of hardware and software configurations and compatibility issues with third-party dependencies mean we cannot guarantee that the project will work in every case. Below are some performance test results for reference, measured on Ubuntu 22.04 LTS with an Intel(R) Xeon(R) Platinum 8352V CPU @ 2.10GHz and an NVIDIA GeForce RTX 4090.