Train 400x faster Static Embedding Models with Sentence Transformers


Large encoder models with many attention layers are effective at using context to produce useful embeddings, but they do so at the high price of slow inference. Static embedding models, by contrast, are extremely small: our desired batch size of 2048 samples fits on our hardware, a single RTX 3090 with 24 GB of VRAM, so we don't need CMNRL (CachedMultipleNegativesRankingLoss) and can train with the plain in-batch-negatives loss instead. And because static embedding models aren't bottlenecked by positional embeddings or superlinear time complexity, they can have arbitrarily high maximum sequence lengths.
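As a rough sketch of what this looks like in code, the snippet below initializes a static embedding model with Sentence Transformers and trains it with the plain MultipleNegativesRankingLoss at a batch size of 2048. The tokenizer, embedding dimension, dataset, learning rate, and output directory are illustrative assumptions, not prescriptions from this article:

```python
from datasets import load_dataset
from tokenizers import Tokenizer
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.models import StaticEmbedding

# A static embedding model is just a token-embedding table whose token
# vectors are pooled into a sentence embedding: no attention layers and
# no positional embeddings, so cost grows only linearly with length.
# The tokenizer and embedding_dim here are illustrative choices.
static_embedding = StaticEmbedding(
    Tokenizer.from_pretrained("google-bert/bert-base-uncased"),
    embedding_dim=1024,
)
model = SentenceTransformer(modules=[static_embedding])

# Any (anchor, positive) pair dataset works with this loss;
# this particular dataset is only an example.
train_dataset = load_dataset("sentence-transformers/gooaq", split="train")

# The model is small enough that a full batch of 2048 fits on a single
# 24 GB GPU, so the plain in-batch-negatives loss suffices and the
# gradient-caching variant (CachedMultipleNegativesRankingLoss) is unneeded.
loss = MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="static-embedding-model",
    per_device_train_batch_size=2048,  # larger batches -> more in-batch negatives
    num_train_epochs=1,
    learning_rate=2e-1,  # assumed value; static models tolerate far higher LRs than transformers
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```

If the batch did not fit in memory, swapping MultipleNegativesRankingLoss for CachedMultipleNegativesRankingLoss would trade some training speed for a much smaller memory footprint at the same effective batch size.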
