Show HN: I Made an Open Source Platform for Structuring Any Unstructured Data


Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks - adithya-s-k/omniparse

Whether you are working with documents, tables, images, videos, audio files, or web pages, OmniParse prepares your data to be clean, structured, and ready for AI applications such as RAG, fine-tuning, and more. OmniParse aims to be an ingestion and parsing platform that accepts any type of data and returns the most structured and actionable output possible in a GenAI (LLM) friendly form.

🦙 LlamaIndex | Langchain | Haystack integrations coming soon
📚 Batch processing data
⭐ Dynamic chunking and structured data extraction based on a specified schema
🛠️ One magic API: just feed in your file, prompt what you want, and we will take care of the rest
🔧 Dynamic model selection and support for external APIs
📄 Batch processing for handling multiple files at once
📦 New open-source model to replace Surya OCR and Marker
