Show HN: Kreuzberg – Modern async Python library for document text extraction


A text extraction library supporting PDFs, images, office documents and more - Goldziher/kreuzberg

Simple and Hassle-Free: Clean API that just works, without complex configuration
Local Processing: No external API calls or cloud dependencies required
Resource Efficient: Lightweight processing without GPU requirements
Lightweight: Has few curated dependencies and a minimal footprint
Format Support: Comprehensive support for documents, images, and text formats
Modern Python: Built with async/await, type hints, and a functional-first approach
Permissive OSS: Kreuzberg and its dependencies have a permissive OSS license

Kreuzberg was built for RAG (Retrieval Augmented Generation) applications, focusing on local processing with minimal dependencies.

Extraction of multiple items offers:
Ordered results
Concurrent processing
Error handling per item
Async and sync interfaces
Same options as single extraction
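The combination of ordered results, concurrent processing, and error handling per item can be illustrated with plain asyncio. The sketch below is not Kreuzberg's API: extract_text and batch_extract are hypothetical names, and the stand-in extractor only reads UTF-8 text files. It shows one common way to get those three properties, assuming asyncio.gather with return_exceptions=True.

```python
from __future__ import annotations

import asyncio
from pathlib import Path


async def extract_text(path: Path) -> str:
    """Hypothetical stand-in extractor: reads a UTF-8 text file in a worker thread."""
    return await asyncio.to_thread(path.read_text, encoding="utf-8")


async def batch_extract(paths: list[Path]) -> list[str | BaseException]:
    """Extract all paths concurrently and return results in input order.

    A failure for one path is returned in that path's slot as an exception
    object, so one bad file does not abort the rest of the batch.
    """
    return await asyncio.gather(
        *(extract_text(p) for p in paths),
        return_exceptions=True,
    )


def batch_extract_sync(paths: list[Path]) -> list[str | BaseException]:
    """Synchronous wrapper around batch_extract for non-async callers."""
    return asyncio.run(batch_extract(paths))
```

A caller would check each slot with isinstance(result, Exception) before using it. Returning exceptions in place keeps the output aligned with the input list, which is what makes per-item error handling straightforward.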
