Mamba-2 – State Space Duality
Homepage of Tri Dao.
It’s been incredibly gratifying to see the line of research on efficient sequence models that we’ve been pursuing for years resonate with the machine learning community and take off more than we could have anticipated.

One of the main reasons for the selectivity introduced in Mamba (e.g. an $A$ that depends on the input $X$) is to let the SSM control whether to remember or ignore particular pieces of information; for example, it can ignore a filler “um” encountered in a text transcript.

First, remember that one main reason SSMs are interesting to begin with is that computing \eqref{eq:ssm} as a recurrence requires maintaining only a constant-size state (size $\mathtt{N}$ per channel) and scales linearly in the sequence length $\mathtt{T}$.
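To make the recurrence view concrete, here is a minimal single-channel sketch in plain Python. The update form `h_t = a_t * h_{t-1} + B_t * x_t`, `y_t = C_t · h_t`, the scalar per-step `a_t`, and the name `ssm_recurrence` are assumptions made for illustration, since the equation itself is not reproduced here; this is not the optimized Mamba implementation.

```python
def ssm_recurrence(x, a, B, C):
    """Run a single-channel selective SSM as a recurrence (illustrative sketch).

    x: list of T floats, the input sequence for one channel
    a: list of T floats, input-dependent scalar decay per step (assumed form)
    B, C: lists of T vectors, each of length N (input and output projections)
    Returns the list of T outputs.
    """
    N = len(B[0])
    h = [0.0] * N  # the state has size N no matter how long the sequence is
    y = []
    for t in range(len(x)):  # one pass over the sequence: linear in T
        # Assumed update: h_t = a_t * h_{t-1} + B_t * x_t
        h = [a[t] * h[n] + B[t][n] * x[t] for n in range(N)]
        # Assumed readout: y_t = C_t . h_t
        y.append(sum(C[t][n] * h[n] for n in range(N)))
    return y


# Selectivity in miniature: at step 1, B_t = 0 and a_t = 1, so the input 5.0
# (think of it as the filler "um") leaves the state untouched.
x = [1.0, 5.0, 2.0]
a = [1.0, 1.0, 1.0]
B = [[1.0], [0.0], [1.0]]
C = [[1.0], [1.0], [1.0]]
outputs = ssm_recurrence(x, a, B, C)
```

Only `h` is carried between steps, which is where the constant-size state and the linear scaling in $\mathtt{T}$ come from. Because `a`, `B`, and `C` vary per step, the model can choose per token whether to write to the state or leave it alone.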