Evaluating RAG for large scale codebases

Read how Qodo evaluates RAG systems to optimize generative AI coding for large-scale codebases using LLM-as-a-judge and regression testing

We wrapped the workflow in a lightweight CLI tool that runs both locally and in CI; it performs prediction and evaluation, then stores the results in an experiment tracking system. Using these tools, we’ve dramatically decreased the effort involved in verifying whether code changes introduced unforeseen quality issues, from hours of manual probing to minutes of automated regression testing. RAG systems for large-scale codebases are a core capability used across many of Qodo’s products and, as such, require a robust evaluation and quality mechanism.
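To make the predict, evaluate, and store loop concrete, here is a minimal Python sketch of what such a regression CLI could look like. It is not Qodo's actual tool: every name in it (`Example`, `run_regression`, `token_overlap_judge`, `stub_predict`, and the `--dataset`, `--output`, `--baseline` flags) is hypothetical. The judge is a simple token-overlap placeholder standing in for an LLM-as-a-judge call, and a local JSON file stands in for the experiment tracking system.

```python
import argparse
import json
from dataclasses import asdict, dataclass
from typing import Callable, List


@dataclass
class Example:
    question: str
    reference_answer: str


@dataclass
class Result:
    question: str
    prediction: str
    score: float


def token_overlap_judge(question: str, reference: str, prediction: str) -> float:
    """Placeholder judge: fraction of reference tokens found in the prediction.

    Assumption: a real setup would call an LLM here and parse its verdict.
    """
    ref_tokens = set(reference.lower().split())
    pred_tokens = set(prediction.lower().split())
    return len(ref_tokens & pred_tokens) / len(ref_tokens) if ref_tokens else 0.0


def stub_predict(question: str) -> str:
    """Hypothetical stand-in for the RAG system under test."""
    return question


def run_regression(
    examples: List[Example],
    predict: Callable[[str], str],
    judge: Callable[[str, str, str], float],
) -> List[Result]:
    """Run prediction, then evaluation, for every example in the dataset."""
    results = []
    for ex in examples:
        prediction = predict(ex.question)
        score = judge(ex.question, ex.reference_answer, prediction)
        results.append(Result(ex.question, prediction, score))
    return results


def mean_score(results: List[Result]) -> float:
    """Average judge score; 0.0 for an empty run."""
    return sum(r.score for r in results) / len(results) if results else 0.0


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="RAG regression check (sketch)")
    parser.add_argument("--dataset", required=True, help="JSON list of examples")
    parser.add_argument("--output", required=True, help="where to store results")
    parser.add_argument("--baseline", type=float, default=0.0,
                        help="minimum acceptable mean score")
    args = parser.parse_args(argv)

    with open(args.dataset) as f:
        examples = [Example(**row) for row in json.load(f)]

    results = run_regression(examples, stub_predict, token_overlap_judge)
    score = mean_score(results)

    # Stand-in for an experiment tracker: persist the run as JSON.
    with open(args.output, "w") as f:
        json.dump({"mean_score": score,
                   "results": [asdict(r) for r in results]}, f, indent=2)

    # A non-zero exit code lets a CI job fail when quality regresses.
    return 0 if score >= args.baseline else 1
```

Calling `main(["--dataset", "qa.json", "--output", "run.json", "--baseline", "0.8"])` (file names are illustrative) returns 1 when the mean score falls below the baseline, which is the signal a CI job would use to flag a regression.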
