
Bamba: An open-source LLM that crosses a transformer with an SSM


The open-source LLM combines the sequence-modeling skill of a transformer with the inference speed of an SSM. IBM Granite will soon adopt key Bamba features.

“They are the bread and butter of electrical engineering — signal processing, robotics, and control theory,” says Ankit Gupta, an IBM researcher who has played a key role in adapting SSMs to deep learning. In 2023, Albert Gu, then a professor at Carnegie Mellon University, and Tri Dao, at Princeton, unveiled a gated SSM variant — Mamba — which helped to inspire a wave of hybrids with names like Samba and MambaFormer. The virtual LLM, or vLLM, has emerged as the go-to open-source inference server for LLMs, and the team behind Bamba worked closely with Red Hat to integrate the model into the platform.
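The state-space models mentioned above can be sketched in a few lines. A discrete linear SSM updates a hidden state as x_k = A·x_{k-1} + B·u_k and reads out y_k = C·x_k, processing a sequence in a single linear-time scan. The scalar example below is illustrative only (the constants and function name are not from Bamba or Mamba, which use learned, input-dependent parameters):

```python
# Minimal sketch of a discrete linear state-space model (SSM), the
# control-theory primitive the article refers to. Scalar case for clarity;
# SSM layers in LLMs run a learned version of this scan over embeddings.

def ssm_scan(u, A=0.9, B=0.1, C=1.0):
    """Run the SSM recurrence x_k = A*x_{k-1} + B*u_k, y_k = C*x_k."""
    x = 0.0   # hidden state, carried across the sequence
    ys = []
    for u_k in u:
        x = A * x + B * u_k   # state update
        ys.append(C * x)      # readout
    return ys

# An impulse input decays geometrically through the state:
outputs = ssm_scan([1.0, 0.0, 0.0, 0.0])
```

Because the state is a fixed-size summary of everything seen so far, the scan runs in O(n) time with constant memory per step, versus the quadratic cost of full attention — the inference-speed advantage that hybrid designs like Bamba try to combine with a transformer's modeling quality.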


