MUVERA: Making multi-vector retrieval as fast as single-vector search

June 25, 2025. Rajesh Jayaram and Laxman Dhulipala, Research Scientists, Google Research

We introduce MUVERA, a state-of-the-art retrieval algorithm that reduces complex multi-vector retrieval back to single-vector maximum inner product search. Neural embedding models have become a cornerstone of modern information retrieval (IR).

Complex and compute-intensive similarity scoring: Chamfer matching is a non-linear operation requiring a matrix product, which is more expensive than a single-vector dot product.

Our experiments demonstrate that MUVERA consistently achieves high retrieval accuracy with significantly reduced latency compared to PLAID, the previous state-of-the-art method. Our work opens up new avenues for efficient multi-vector retrieval, which is crucial for applications including search engines, recommendation systems, and natural language processing.
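To make the cost of Chamfer matching concrete, here is a minimal sketch in plain Python. It assumes the common definition of Chamfer similarity: for each query vector, take the largest inner product with any document vector, then sum those maxima. The function names and toy vectors are illustrative only and are not taken from the MUVERA code.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors (the single-vector score)."""
    return sum(x * y for x, y in zip(a, b))


def chamfer_similarity(query: Sequence[Vector], doc: Sequence[Vector]) -> float:
    """Sum, over query vectors, of the best inner product with any document vector.

    Every query vector is compared with every document vector, so the cost is
    len(query) * len(doc) inner products, versus one for single-vector scoring.
    """
    return sum(max(dot(q, p) for p in doc) for q in query)


# Hypothetical toy data: 2 query vectors and 3 document vectors in 2 dimensions.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]

print(chamfer_similarity(query, doc))  # 1.0 + 2.0 = 3.0
```

The inner `max` is what makes the score non-linear, and the all-pairs comparison corresponds to the matrix product mentioned above. Scoring a document with a single embedding needs only one call to `dot`.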
