Show HN: I Made an Open Source Platform for Structuring Any Unstructured Data
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks - adithya-s-k/omniparse
Whether you are working with documents, tables, images, videos, audio files, or web pages, OmniParse prepares your data to be clean, structured, and ready for AI applications such as RAG and fine-tuning. It aims to be a single ingestion and parsing platform that accepts any of these data types and returns structured, actionable output that is GenAI (LLM) friendly.

🦙 LlamaIndex | Langchain | Haystack integrations coming soon
📚 Batch processing data
⭐ Dynamic chunking and structured data extraction based on a specified schema
🛠️ One magic API: just feed in your file and prompt for what you want, and we will take care of the rest
🔧 Dynamic model selection and support for external APIs
📄 Batch processing for handling multiple files at once
📦 New open-source model to replace Surya OCR and Marker