Nixiesearch: Running Lucene over S3, and why we're building a new search engine
A new search engine in 2024? Yes, but a stateless one: the index lives on S3, it is serverless with no cluster state, and it keeps all the Lucene features such as filters, autocomplete and facets. It also supports local embedding and RAG inference.
The prize wheel above summarises the author's personal incident experience with Elasticsearch, OpenSearch and Solr, but other modern vector search engines such as Weaviate and Qdrant are not immune.
There have been a number of experiments over the years to add S3 support to Lucene, starting 15 years ago with one by Shay Banon (creator of Elasticsearch and now CEO of Elastic).
A common pain point of existing search engines is the need for a complicated ML-driven indexing pipeline to compute custom embeddings, do text processing and so on.