Bamba: An open-source LLM that crosses a transformer with an SSM
The open-source LLM combines the sequence-modeling skill of a transformer with the inferencing speed of an SSM. IBM Granite will soon adopt key Bamba features.
“They [SSMs] are the bread and butter of electrical engineering: signal processing, robotics, and control theory,” says Ankit Gupta, an IBM researcher who has played a key role in adapting SSMs to deep learning. In 2023, Albert Gu, then a professor at CMU, and Tri Dao, at Princeton, unveiled a gated SSM variant, Mamba2, which helped to inspire a wave of hybrids with names like Samba and MambaFormer. vLLM (“virtual” LLM) has emerged as the go-to open-source inference server for LLMs, and the team behind Bamba worked closely with Red Hat to integrate the model into the platform.