How AI on Microcontrollers Works: Operators and Kernels
The buzz around “edge AI”, which means something slightly different to almost everyone you talk to, is well past reaching a fever pitch. Regardless of what edge AI means to you, the one commonality is typically that the hardware on which inference is being performed is constrained in one or more dimensions, whether it be compute, memory, or network bandwidth. Perhaps the most constrained of these platforms are microcontrollers.
I have found that, while there is much discourse around “running AI” (i.e. performing inference) on microcontrollers, there is a general lack of information about what these systems are actually capable of, and how new hardware advancements change that equation. As evidenced by the effectively unlimited demand in the Graphics Processing Unit (GPU) market, however, there are significant performance gains to be had by parallelizing operations in hardware. We’ve now seen the full spectrum of operator optimization: from kernels implemented purely in C, to those that leverage hardware instructions provided by architecture extensions, and finally to those that offload inference to a wholly separate processor.