The illusion of state in state-space models


State-space models (SSMs) have emerged as a potential alternative architecture for building large language models (LLMs) compared to the previously ubiquitous transformer architecture. One theoretical weakness of transformers is that they cannot express certain kinds of sequential computation and state tracking (Merrill & Sabharwal, 2023), which SSMs are explicitly designed to address via their close architectural similarity to recurrent neural networks (RNNs). But do SSMs truly have an advantage (over transformers) in expressive power for state tracking? Surprisingly, the answer is no. Our analysis reveals that the expressive power of SSMs is limited very similarly to transformers: SSMs cannot express computation outside the complexity class $\mathsf{TC}^0$. In particular, this means they cannot solve simple state-tracking problems like permutation composition. It follows that SSMs are provably unable to accurately track chess moves with certain notation, evaluate code, or track entities in a long narrative. To supplement our formal analysis, we report experiments showing that Mamba-style SSMs indeed struggle with state tracking. Thus, despite its recurrent formulation, the "state" in an SSM is an illusion: SSMs have similar expressiveness limitations to non-recurrent models like transformers, which may fundamentally limit their ability to solve real-world state-tracking problems.
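To make the permutation composition problem mentioned above concrete, here is a minimal sketch of what the task asks of a model: read a sequence of permutations and report their running composition. The function names `compose` and `track_state`, the tuple encoding, and the three-element example are illustrative assumptions, not taken from the paper.

```python
def compose(p, q):
    """Return the permutation that applies p first, then q.

    Permutations are tuples where position i holds the image of i.
    """
    return tuple(q[i] for i in p)


def track_state(sequence, n):
    """Fold a sequence of permutations of n items into one final state.

    Each step depends on the state left by the previous step, which is
    the sequential dependency that state tracking refers to.
    """
    state = tuple(range(n))  # identity: nothing has moved yet
    for perm in sequence:
        state = compose(state, perm)
    return state


# Hypothetical example on three items: two adjacent swaps.
SWAP_01 = (1, 0, 2)
SWAP_12 = (0, 2, 1)

print(track_state([SWAP_01, SWAP_12], 3))  # (2, 0, 1)
print(track_state([SWAP_12, SWAP_01], 3))  # (1, 2, 0)
```

The two calls give different results because permutation composition is order-sensitive, so the final state cannot be recovered from a count of which swaps occurred. A sequential loop like this one solves the task trivially; the abstract's claim concerns whether an SSM or transformer can express the same computation for arbitrarily long inputs.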

Paper: The Illusion of State in State-Space Models, by William Merrill, Jackson Petty, and Ashish Sabharwal.
