Beyond transformers: Nvidia’s MambaVision aims to unlock faster, cheaper enterprise computer vision
Nvidia is updating its computer vision models with new versions of MambaVision that combine the best of Mamba and transformers to improve efficiency.
The architecture’s innovation lies in its redesigned Mamba formulation, engineered specifically for visual feature modeling and augmented by the strategic placement of self-attention blocks in the final layers to capture complex spatial dependencies.

“Since the initial release, we’ve significantly enhanced MambaVision, scaling it up to an impressive 740 million parameters,” Ali Hatamizadeh, Senior Research Scientist at Nvidia, wrote in a Hugging Face discussion post.

Independent AI consultant Alex Fazio told VentureBeat that training the new MambaVision models on larger datasets makes them much better at handling more diverse and complex tasks.
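The hybrid layout described above, sequential Mamba-style mixing in the early layers followed by self-attention in the final layers, can be illustrated with a minimal NumPy sketch. Everything here is a simplified stand-in (the scan below is a plain linear recurrence, not Nvidia's actual Mamba formulation, and all names and shapes are invented for illustration):

```python
import numpy as np

def ssm_scan(x, decay=0.9):
    # Simple linear recurrence h_t = decay * h_{t-1} + x_t: a toy stand-in
    # for a Mamba-style sequential state-space mixer over the token sequence.
    h = np.zeros_like(x[0])
    out = np.empty_like(x)
    for t in range(x.shape[0]):
        h = decay * h + x[t]
        out[t] = h
    return out

def self_attention(x):
    # Plain single-head softmax self-attention (queries = keys = values = x
    # for brevity); this models the attention blocks in the final layers.
    scores = x @ x.T / np.sqrt(x.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ x

def hybrid_backbone(tokens, n_mamba=3, n_attn=1):
    # Early blocks: efficient sequential mixing; final blocks: self-attention
    # to capture global spatial dependencies, mirroring the hybrid design
    # described in the article.
    for _ in range(n_mamba):
        tokens = ssm_scan(tokens)
    for _ in range(n_attn):
        tokens = self_attention(tokens)
    return tokens

patches = np.random.default_rng(0).standard_normal((16, 8))  # 16 tokens, dim 8
out = hybrid_backbone(patches)
print(out.shape)  # (16, 8)
```

The design intuition is that the linear-time scan keeps most of the network cheap, while a few attention layers at the end recover the global receptive field that pure recurrent mixing lacks.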