Don't bother parsing: Just use images for RAG


If search is the game, looks matter

Queries that depend on spatial relationships (“which part of the diagram labels Q3 total?”) become impossible; embeddings for text and images live in separate spaces; and retrieval pipelines struggle to reconcile them.

To validate these observations beyond anecdotal evidence, we worked with TLDC (The LLM Data Company) to build an open-source financial document benchmark with 45 challenging questions across NVIDIA 10-Qs, Palantir investor presentations, and JPMorgan reports.

We're exploring how to combine our visual document retrieval with specialized knowledge graphs, how to build systems that can reason about causality and implication rather than just correlation, and how to provide the kind of confidence intervals and uncertainty quantification that enterprise applications demand.
