
Falcon Mamba 7B’s powerful new AI architecture offers alternative to transformer models

In several benchmarks, Falcon Mamba 7B convincingly outperformed Llama 3 8B, Llama 3.1 8B, Gemma 7B and Mistral 7B.

According to TII, its all-new Falcon model uses the Mamba SSM architecture originally proposed by researchers at Carnegie Mellon and Princeton Universities in a paper dated December 2023. This allows the model to focus on or ignore particular inputs, similar to how attention works in transformers, while processing long sequences of text – such as an entire book – without requiring additional memory or computing resources. In a separate throughput test, it outperformed Mistral 7B’s efficient sliding-window attention architecture, generating all tokens at a constant speed and with no increase in CUDA peak memory.
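To make that mechanism concrete, here is a minimal toy sketch of a Mamba-style selective state-space recurrence in Python/NumPy. All names, shapes and parameter values below (A, W_B, W_C, W_dt, d_state) are illustrative assumptions, not TII's or the original paper's implementation; what it demonstrates is that per-token parameters are derived from the input itself (the "selective" part, loosely analogous to attention), while only a fixed-size state is carried between tokens, so memory does not grow with sequence length.

```python
# Toy sketch of a selective state-space (Mamba-style) recurrence.
# All dimensions and parameter names are hypothetical illustrations,
# not TII's Falcon Mamba implementation.
import numpy as np

def selective_ssm(x, d_state=16):
    """Process a sequence token by token with a fixed-size hidden state.

    x: (seq_len, d_model) input sequence.
    Unlike attention, memory use is constant in seq_len: only the
    (d_model, d_state) state h is carried between steps.
    """
    seq_len, d_model = x.shape
    rng = np.random.default_rng(0)
    # Static parameters (shapes chosen for illustration only).
    A = -np.exp(rng.standard_normal((d_model, d_state)))  # negative -> state decay
    W_B = rng.standard_normal((d_model, d_state)) * 0.1
    W_C = rng.standard_normal((d_model, d_state)) * 0.1
    W_dt = rng.standard_normal(d_model) * 0.1

    h = np.zeros((d_model, d_state))  # constant-size state
    ys = np.empty_like(x)
    for t in range(seq_len):
        xt = x[t]                                # current token, (d_model,)
        # Input-dependent ("selective") parameters: the model decides per
        # token how strongly to write into and read from the state.
        dt = np.log1p(np.exp(W_dt * xt))         # softplus step size, (d_model,)
        B = xt @ W_B / d_model                   # write direction, (d_state,)
        C = xt @ W_C / d_model                   # read direction, (d_state,)
        # Discretized update: old state decays via A, new input enters via B.
        h = np.exp(dt[:, None] * A) * h + dt[:, None] * (xt[:, None] * B[None, :])
        ys[t] = h @ C                            # read the state via C
    return ys

# Memory is O(d_model * d_state) regardless of sequence length.
x = np.random.default_rng(1).standard_normal((1024, 64))
y = selective_ssm(x)
print(y.shape)  # (1024, 64)
```

By contrast, a transformer must keep a key-value cache that grows linearly with the number of tokens, which is why peak memory climbs during long-context generation; a constant-size state is what lets an SSM hold generation speed and CUDA peak memory flat, as in the throughput test described above.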


