Matryoshka Representation Learning with CLIP

TL;DR We introduce Matryoshka Representation Learning (MRL), which allows flexible embedding sizes in vector databases and so lets you trade off efficiency against granularity. With MRL, embeddings can be condensed into smaller dimensions while preserving performance, giving cost-effective flexibility in multimodal retrieval and ranking tasks.
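To make "condensing into smaller dimensions" concrete, here is a minimal sketch of how an MRL-style embedding is typically shortened at query time: keep the leading coordinates and re-normalize before computing cosine similarity. The function names `truncate_embeddings` and `top_k` are hypothetical helpers for illustration, not part of any Marqo or CLIP API, and the sketch assumes the model was trained so that prefixes of the full vector are themselves useful embeddings.

```python
import numpy as np


def truncate_embeddings(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` coordinates of each row, then L2-normalize.

    Assumes `embeddings` has shape (n, full_dim) and comes from a model
    trained with MRL, so that leading prefixes remain meaningful.
    """
    if not 1 <= dim <= embeddings.shape[1]:
        raise ValueError("dim must be between 1 and the full embedding size")
    prefix = embeddings[:, :dim]
    norms = np.linalg.norm(prefix, axis=1, keepdims=True)
    # Guard against division by zero for all-zero prefixes.
    return prefix / np.clip(norms, 1e-12, None)


def top_k(query: np.ndarray, docs: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k documents with highest cosine similarity.

    Both `query` (shape (dim,)) and `docs` (shape (n, dim)) are expected
    to be L2-normalized, so the dot product equals cosine similarity.
    """
    scores = docs @ query
    return np.argsort(-scores)[:k]
```

A vector index could then store `truncate_embeddings(docs, 128)` instead of the full-size vectors and search with a query truncated to the same size, which reduces storage and compute per comparison.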

In this blog post, we train Generalized Contrastive Learning (GCL), our model that extends CLIP to allow multiple representations per sample, with MRL on a subset of GS-Marqo-10M. Additional projection layers may or may not be beneficial; for reference, the original MRL paper trained a separate linear classifier head for each sub-dimension.
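As a rough illustration of what training "with MRL" means, the sketch below sums a contrastive loss over several nested prefix sizes of the same embedding. It is a sketch under stated assumptions: it uses a plain CLIP-style symmetric contrastive loss as a stand-in rather than the GCL objective, it adds no extra projection heads, and the names `clip_loss`, `matryoshka_loss`, the `dims` values, and the temperature are illustrative choices that do not come from the post.

```python
import numpy as np


def _log_softmax(logits: np.ndarray, axis: int) -> np.ndarray:
    """Numerically stable log-softmax along `axis`."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def clip_loss(a: np.ndarray, b: np.ndarray, temperature: float = 0.07) -> float:
    """Symmetric contrastive loss; row i of `a` is paired with row i of `b`.

    Stand-in for the real training objective (assumption: plain CLIP-style
    loss, not GCL).
    """
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    idx = np.arange(len(a))
    loss_ab = -_log_softmax(logits, axis=1)[idx, idx].mean()
    loss_ba = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return float(0.5 * (loss_ab + loss_ba))


def matryoshka_loss(a, b, dims=(64, 128, 256, 512), weights=None) -> float:
    """Weighted sum of the contrastive loss over nested prefixes.

    Each prefix a[:, :d] is treated as a complete embedding, which is
    what encourages the leading coordinates to carry the most information.
    """
    if weights is None:
        weights = [1.0] * len(dims)  # equal weighting is an assumption
    return sum(w * clip_loss(a[:, :d], b[:, :d]) for d, w in zip(dims, weights))
```

In a real training loop the same computation would be written in an autodiff framework so gradients flow back into the encoders; NumPy is used here only to keep the example self-contained.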
