Searching a Codebase in English
Cosine similarity in code vs. text.
To retrieve, we start by generating a semantic vector embedding for the query, in this case, "Word storing frequently accessed key-value pairs". We then compare that against our database of vectors and find the one(s) that match the closest, i.e., have the highest cosine similarity (equivalently, the highest dot product once the vectors are normalized to unit length). Surely you could split up a codebase into files or functions as the “chunks”, embed them, and do a similar semantic similarity-based search.
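To make the retrieval step concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, and the three `chunks` are made-up descriptions rather than anything from an actual codebase. Only the ranking logic (score every chunk by cosine similarity to the query, keep the highest) mirrors the process described above.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


def search(query, chunks, top_k=1):
    """Return the top_k (score, chunk) pairs, highest similarity first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical "database" of chunks, invented for this example.
chunks = [
    "cache: a store for frequently accessed key-value pairs",
    "queue: a first-in first-out list of pending jobs",
    "logger: writes timestamped messages to a file",
]
query = "Word storing frequently accessed key-value pairs"

best_score, best_chunk = search(query, chunks)[0]
print(best_chunk)
```

With a real embedding model, `embed` would return a dense vector and the match would not depend on shared words, but the comparison and ranking would look the same.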