Google’s new neural-net LLM architecture separates memory components to control exploding costs of capacity and compute


Titans architecture complements attention layers with neural memory modules that select bits of information worth saving in the long term.

The Google researchers argue that linear models fall short of classic transformers because they compress their contextual data and tend to miss important details. To fill this gap, the researchers propose a “neural long-term memory” module that can learn new information at inference time without the inefficiency of the full attention mechanism. With LLMs supporting longer context windows, there is growing potential for applications that squeeze new knowledge directly into the prompt instead of retrieving it with techniques such as RAG.
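To make the idea concrete, here is a minimal sketch, not Google’s implementation, of a memory module that learns at inference time: the memory is a small MLP whose weights are nudged by an online gradient step on a “surprise” (reconstruction) loss each time a new token arrives, so later queries can recall stored associations without attending over the whole history. The class name, dimensions, loss, and learning rate are illustrative assumptions, written in PyTorch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NeuralLongTermMemory(nn.Module):
    """Illustrative memory module: a small MLP whose weights are updated
    by online gradient steps at inference time, so information from the
    token stream can be stored without attending over the full history."""

    def __init__(self, dim: int, hidden: int = 256, lr: float = 1e-2):
        super().__init__()
        self.to_key = nn.Linear(dim, dim, bias=False)    # key projection
        self.to_value = nn.Linear(dim, dim, bias=False)  # value projection
        self.memory = nn.Sequential(                     # the memory itself
            nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim)
        )
        self.lr = lr

    @torch.no_grad()
    def read(self, query: torch.Tensor) -> torch.Tensor:
        # Retrieval is a single forward pass, independent of context length.
        return self.memory(query)

    def write(self, token: torch.Tensor) -> torch.Tensor:
        # "Surprise" loss: how badly the memory currently predicts the
        # key -> value association for the incoming token(s).
        key = self.to_key(token).detach()
        value = self.to_value(token).detach()
        loss = F.mse_loss(self.memory(key), value)
        grads = torch.autograd.grad(loss, list(self.memory.parameters()))
        with torch.no_grad():
            for p, g in zip(self.memory.parameters(), grads):
                p.add_(g, alpha=-self.lr)  # one online gradient step
        return loss.detach()


if __name__ == "__main__":
    dim = 64
    mem = NeuralLongTermMemory(dim)
    stream = torch.randn(100, dim)        # stand-in for token embeddings
    for tok in stream:
        mem.write(tok.unsqueeze(0))       # learn as the context streams by
    recalled = mem.read(stream[:1])       # later retrieval from memory
    print(recalled.shape)                 # torch.Size([1, 64])
```

In this sketch a read costs one forward pass no matter how long the context has been, which mirrors the efficiency argument the researchers make against running full attention over the entire history.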

Read the full story on VentureBeat.

Read more on: Google, LLM, costs

Related news:

Google won't add fact-checks despite new EU law
Test-driven development with an LLM for fun and profit
Google reports halving code migration time with AI help