Microsoft BitNet: inference framework for 1-bit LLMs
bitnet.cpp is the official inference framework for 1-bit LLMs, developed in the microsoft/BitNet repository on GitHub.
bitnet.cpp achieves speedups of 1.37x to 5.07x on ARM CPUs, with larger models seeing greater performance gains. The tested models are dummy setups used in a research context to demonstrate bitnet.cpp's inference performance. The project also credits the T-MAC team for helpful discussions on the LUT (lookup table) method for low-bit LLM inference.
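The "1-bit" label refers to BitNet b1.58-style models, whose weights are constrained to the ternary set {-1, 0, +1} (about 1.58 bits per weight). Below is a minimal, hypothetical sketch of the absmean ternary quantization described in the BitNet b1.58 paper: scale the weight matrix by its mean absolute value, then round and clip to ternary values. The function name and NumPy-based layout are illustrative, not taken from the bitnet.cpp codebase.

```python
import numpy as np

def absmean_ternary_quantize(w, eps=1e-8):
    """Sketch of absmean quantization to ternary weights {-1, 0, +1}.

    Scales the tensor by the mean absolute weight (plus a small eps to
    avoid division by zero), then rounds and clips to the ternary set.
    Returns the quantized weights and the scale for dequantization.
    """
    gamma = np.mean(np.abs(w)) + eps            # per-tensor scale
    w_q = np.clip(np.round(w / gamma), -1, 1)   # ternary weights
    return w_q, gamma

# Toy example: a small float weight matrix
w = np.array([[0.9, -0.04, -1.3],
              [0.2,  0.7,  -0.6]])
w_q, gamma = absmean_ternary_quantize(w)
# w_q now contains only values from {-1, 0, +1};
# an approximate reconstruction is w_q * gamma
```

Representing weights this way is what enables the lookup-table (LUT) inference tricks mentioned above: with only three possible weight values, matrix products reduce largely to additions and subtractions rather than full multiplications.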