The Biology of a Large Language Model


Large language models display impressive capabilities. However, for the most part, the mechanisms by which they do so are unknown.

In our companion paper, Circuit Tracing: Revealing Computational Graphs in Language Models, we build on recent work to introduce a new set of tools for identifying features and mapping connections between them, analogous to neuroscientists producing a “wiring diagram” of the brain. We rely heavily on a tool we call attribution graphs, which allows us to partially trace the chain of intermediate steps that a model uses to transform a specific input prompt into an output response.

Related prior work primarily relies on the logit lens technique and component-level activation patching to show that models have an English-aligned intermediate representation but subsequently convert it to a language-specific output in the final layers.
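As a rough illustration of the logit lens technique mentioned above, the sketch below projects each layer's hidden state through the unembedding matrix to see which token that layer would predict on its own. It is a toy under stated assumptions, not the method of any particular paper: the hidden states and unembedding matrix are random NumPy arrays, and the names `logit_lens`, `hidden_states`, and `unembed` are hypothetical.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize each vector to zero mean and unit variance (no learned scale or bias)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def logit_lens(hidden_states, unembed):
    """Decode every layer's hidden state as if it were the final layer.

    hidden_states: (n_layers, d_model) residual stream at one token position.
    unembed:       (d_model, vocab_size) output projection.
    Returns per-layer next-token probabilities, shape (n_layers, vocab_size).
    """
    logits = layer_norm(hidden_states) @ unembed
    return softmax(logits)


# Toy data standing in for a real model (assumption: random values, tiny sizes).
rng = np.random.default_rng(0)
n_layers, d_model, vocab_size = 4, 8, 5
hidden_states = rng.normal(size=(n_layers, d_model))
unembed = rng.normal(size=(d_model, vocab_size))

probs = logit_lens(hidden_states, unembed)
top_token_per_layer = probs.argmax(axis=-1)  # most likely token id at each layer
```

With a real model, comparing `top_token_per_layer` across layers is how one would check whether intermediate layers favor, for example, English tokens before the final layers settle on the output language.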
