Building reliable systems out of unreliable agents

This is the process our engineering team uses to create reliable AI systems out of unreliable AI agents.

Grab your favorite LLM-provider client (or just use their API; there are some good reasons to stick with HTTP) and integrate it into your product in the most minimal way possible.

It’s also possible you won’t be able to use a single metric if you’re evaluating fuzzy properties of your answers, in which case you can at least look at what breaks after each change and make a judgment call.

In the end, we used BigQuery to get data out, OpenAI for producing embeddings, and Pinecone for storage and nearest-neighbor search, because that was the easiest way for us to deploy something without setting up a lot of new infrastructure.
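To make the nearest-neighbor search step concrete, here is a minimal sketch in plain Python. It is not Pinecone's API or the team's actual code: the function names, the `TOY_INDEX` dictionary, and the three-dimensional vectors are all hypothetical stand-ins for real embeddings, and a brute-force scan replaces the indexed search a vector database would perform.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, index, k=2):
    """Return the ids of the k vectors in `index` most similar to `query`.

    Brute force: every stored vector is scored, so this only suits small
    collections. A vector database does the same job with an approximate index.
    """
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy "embeddings"; real ones would come from an embedding model
# and have hundreds or thousands of dimensions.
TOY_INDEX = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

if __name__ == "__main__":
    print(nearest_neighbors([1.0, 0.0, 0.0], TOY_INDEX, k=2))
```

A brute-force version like this can also serve as a reference when checking that a hosted index returns sensible neighbors for a handful of known queries.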