A visual exploration of vector embeddings

For PyCon 2025, I created a poster exploring vector embedding models, which you can download at full size. In this post, I'll translate ...

I found that spike in every single vector embedding generated from the model - short ones, long ones, English ones, Spanish ones, etc.

However, once we start growing our vector database size, we typically need to use an Approximate Nearest Neighbors (ANN) algorithm to search the embedding space heuristically.

| Algorithm | Python package | Example database support |
| --- | --- | --- |
| HNSW | hnswlib | PostgreSQL pgvector extension, Azure AI Search, Chromadb, Weaviate |
| DiskANN | diskannpy | Cosmos DB |
| IVFFlat | faiss | PostgreSQL pgvector extension |
| Faiss | faiss | None, in-memory index only* |

When our database grows to include millions or even billions of vectors, we start to feel the effects of vector size.
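To get a feel for why vector size matters at that scale, here is a rough back-of-the-envelope sketch. It assumes each dimension is stored as a 32-bit float and counts only the raw vectors, not any index overhead. The dimension counts (256 and 1536) and the function names are hypothetical choices for illustration, not figures from the poster.

```python
# Rough storage estimate for raw embeddings, ignoring index overhead.
BYTES_PER_FLOAT32 = 4  # assumption: each dimension is stored as a 32-bit float


def embedding_storage_bytes(num_vectors: int, dimensions: int,
                            bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Return the bytes needed to store the raw vectors alone."""
    return num_vectors * dimensions * bytes_per_value


def human_readable(num_bytes: int) -> str:
    """Format a byte count using decimal units (1 GB = 10**9 bytes)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value below 1000,
        # or at the largest unit we know about.
        if value < 1000 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


if __name__ == "__main__":
    for count in (1_000_000, 1_000_000_000):
        for dims in (256, 1536):  # hypothetical dimension counts
            size = embedding_storage_bytes(count, dims)
            print(f"{count:>13,} vectors x {dims:>4} dims = {human_readable(size)}")
```

Under these assumptions, storage scales linearly with both the number of vectors and the number of dimensions, so a vector with six times as many dimensions needs six times the space before any index is built on top of it.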
