
Chunkr – Vision model based PDF chunking


Vision model based PDF chunking. Repository: lumina-ai-inc/chunkr on GitHub.

We achieved this by bringing state-of-the-art search technology (the best in dense and sparse vector embeddings) to academic research. Chunk my docs provides a self-hostable solution that uses state-of-the-art vision models for segment extraction and OCR and unifies their output through a Rust Actix server. This setup processes PDFs and extracts segments at approximately 5 pages per second on a single NVIDIA L4 instance, making it a cost-effective and scalable option for high-accuracy bounding-box segment extraction and OCR.
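
To put the quoted throughput in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the figure of about 5 pages per second holds steadily for the whole job, which the summary does not guarantee. The function names and the hourly GPU price are hypothetical placeholders for illustration, not part of the project or any cloud provider's price list.

```python
# Approximate throughput quoted above for a single NVIDIA L4 instance.
PAGES_PER_SECOND = 5.0


def processing_seconds(pages: int, pages_per_second: float = PAGES_PER_SECOND) -> float:
    """Estimate wall-clock seconds to process `pages` at a steady throughput."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    if pages_per_second <= 0:
        raise ValueError("pages_per_second must be positive")
    return pages / pages_per_second


def gpu_cost_usd(pages: int, hourly_rate_usd: float,
                 pages_per_second: float = PAGES_PER_SECOND) -> float:
    """Estimate GPU cost for a job, given a (hypothetical) hourly instance price."""
    hours = processing_seconds(pages, pages_per_second) / 3600.0
    return hours * hourly_rate_usd


if __name__ == "__main__":
    pages = 10_000
    hypothetical_rate = 1.00  # USD per hour; placeholder, not a real quote
    minutes = processing_seconds(pages) / 60.0
    print(f"{pages} pages: about {minutes:.1f} minutes")
    print(f"Estimated cost: ${gpu_cost_usd(pages, hypothetical_rate):.2f}")
```

Real jobs will vary with page complexity, OCR load, and batching, so treat the result as an order-of-magnitude estimate only.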
