Mamba-2 – State Space Duality


Homepage of Tri Dao.

It’s been incredibly gratifying to see the line of research on efficient sequence models we’ve been pursuing for years resonate with the machine learning community and take off more than we could have anticipated. One of the main reasons for the selectivity introduced in Mamba (e.g. $A$ that depends on the input $X$) is to let the SSM control whether to remember or ignore particular pieces of information; for example, it can ignore a filler “um” encountered in a text transcript. Recall that one main reason SSMs are interesting to begin with is that computing the SSM as a recurrence requires maintaining only a constant-size state (size $\mathtt{N}$ per channel) and scales linearly in the sequence length $\mathtt{T}$.
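To make the recurrence view concrete, here is a minimal sketch of a selective SSM for a single channel. It is an illustration under stated assumptions, not the Mamba-2 implementation: the decay $A$ is taken to be one input-dependent scalar `a_t` per step, discretization is omitted, and the names `ssm_recurrence`, `a`, `B`, and `C` are hypothetical. The state `h` has fixed length $\mathtt{N}$, and the loop runs once per time step, so the cost grows linearly in $\mathtt{T}$.

```python
def ssm_recurrence(x, a, B, C):
    """Run h_t = a_t * h_{t-1} + B_t * x_t and y_t = <C_t, h_t> for one channel.

    x: list of T floats (inputs)
    a: list of T floats (assumed scalar, input-dependent decay per step)
    B, C: lists of T vectors, each of length N
    """
    N = len(B[0])
    h = [0.0] * N  # constant-size state: length N, independent of T
    ys = []
    for x_t, a_t, B_t, C_t in zip(x, a, B, C):
        # Update the state in place of the old one; no history is stored.
        h = [a_t * h_i + b_i * x_t for h_i, b_i in zip(h, B_t)]
        ys.append(sum(c_i * h_i for c_i, h_i in zip(C_t, h)))
    return ys


# Toy example with N = 2 and T = 3. The middle token plays the role of a
# filler: a_t = 1 keeps the state and B_t = 0 blocks the input, so the
# state (and the output) is unchanged by that token.
x = [1.0, 5.0, 2.0]
a = [1.0, 1.0, 1.0]
B = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
C = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
print(ssm_recurrence(x, a, B, C))  # [1.0, 1.0, 3.0]
```

In this toy run the large input `5.0` at the second step leaves no trace in the output, which is the "ignore" behavior that input-dependent parameters make possible. Setting `a_t = 0.0` at a step would instead discard everything remembered so far.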
