Researchers upend AI status quo by eliminating matrix multiplication in LLMs
Running AI models without matrix math means far less power consumption—and fewer GPUs?
The technique has not yet been peer-reviewed, but the researchers—Rui-Jie Zhu, Yu Zhang, Ethan Sifferman, Tyler Sheaves, Yiqiao Wang, Dustin Richmond, Peng Zhou, and Jason Eshraghian—claim that their work challenges the prevailing paradigm that matrix multiplication operations are indispensable for building high-performing language models. According to the authors, BitNet demonstrated the viability of using binary and ternary weights in language models, successfully scaling up to 3 billion parameters while maintaining competitive performance. BitNet's limitations motivated the current study, pushing the researchers to develop a completely "MatMul-free" architecture that could maintain performance while eliminating matrix multiplications even in the attention mechanism.
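To see why ternary weights can eliminate multiplication, consider a minimal sketch (not the authors' code, just an illustration of the general idea): when every weight is constrained to -1, 0, or +1, each multiply-accumulate in a dense layer collapses into a signed addition, so the whole layer can be computed with additions and subtractions alone.

```python
import numpy as np

def ternary_matvec(W_ternary, x):
    """Matrix-vector product with ternary weights in {-1, 0, +1}.

    No multiplications needed: add x[j] where W[i, j] == +1,
    subtract x[j] where W[i, j] == -1, and skip it where W[i, j] == 0.
    """
    out = np.zeros(W_ternary.shape[0], dtype=x.dtype)
    for i in range(W_ternary.shape[0]):
        row = W_ternary[i]
        out[i] = x[row == 1].sum() - x[row == -1].sum()
    return out

# Demo: the addition-only path matches an ordinary matmul.
rng = np.random.default_rng(0)
W = rng.integers(-1, 2, size=(4, 8)).astype(np.float32)  # ternary weights
x = rng.standard_normal(8).astype(np.float32)
assert np.allclose(ternary_matvec(W, x), W @ x)
```

On hardware, replacing multiply-accumulate units with adders in this way is what promises the lower power consumption the researchers describe.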