Trying Kolmogorov-Arnold Networks in Practice
There's been a fair bit of buzz about Kolmogorov-Arnold networks online lately. Several research papers have circulated claiming that they offer better accuracy or faster training than traditional neural networks/MLPs at the same parameter count.
I used a version of positional encoding to expand the input coordinates from single scalars into small vectors, which makes it easier for the networks to learn high-frequency features; a sketch of this encoding is below.

Each KAN layer also has a residual path built around a base function (SiLU in the reference KAN implementation): the layer's input vector gets passed through this base function element-wise, multiplied by yet another set of learnable weights of shape (in_count, out_count), and then added to the outputs of the splines. A sketch of this layer structure follows the encoding example.

The main theme seems to be that the Adam optimizer is quite good at doing its job regardless of the computational graph it has to work with, and the most significant factor controlling performance is just parameter count.
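The post doesn't spell out the exact encoding, so here's a minimal sketch of a standard Fourier-feature-style positional encoding. The geometric frequency schedule, the sin/cos pairing, and the `n_freqs` parameter are assumptions for illustration, not the post's actual code:

```python
import torch

def positional_encode(coords: torch.Tensor, n_freqs: int = 4) -> torch.Tensor:
    """Expand each scalar coordinate into a small vector of sin/cos features.

    coords: (batch, in_dim) raw coordinates, assumed to lie roughly in [-1, 1].
    Returns: (batch, in_dim * 2 * n_freqs) encoded features.
    """
    # Frequencies grow geometrically: 1, 2, 4, 8, ... (an assumed schedule).
    freqs = 2.0 ** torch.arange(n_freqs)               # (n_freqs,)
    angles = coords.unsqueeze(-1) * freqs * torch.pi   # (batch, in_dim, n_freqs)
    encoded = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
    return encoded.flatten(start_dim=1)                # (batch, in_dim * 2 * n_freqs)

# Example: 32 (x, y) coordinates expanded from 2 scalars to 16 features each.
x = torch.rand(32, 2) * 2.0 - 1.0
features = positional_encode(x)  # shape (32, 16)
```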
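And here's a hedged sketch of the layer structure described above. To keep it short, Gaussian RBF bases stand in for a real KAN layer's B-splines (so `n_bases`, the center range, and the bandwidth are all assumptions), but the residual path follows the description: an element-wise base function (SiLU), a learnable weight matrix of shape (in_count, out_count), and addition to the spline outputs:

```python
import torch
from torch import nn

class SimpleKANLayer(nn.Module):
    """Sketch of a KAN layer: per-edge learned 1-D functions (Gaussian RBF
    bases standing in for B-splines) plus the base-function residual path."""

    def __init__(self, in_count: int, out_count: int, n_bases: int = 8):
        super().__init__()
        # Fixed RBF centers spanning an assumed input range of [-1, 1].
        self.register_buffer("centers", torch.linspace(-1.0, 1.0, n_bases))
        # Per-edge coefficients for the spline path: (in_count, out_count, n_bases).
        self.spline_w = nn.Parameter(torch.randn(in_count, out_count, n_bases) * 0.1)
        # Learnable weights for the base-function path: (in_count, out_count).
        self.base_w = nn.Parameter(torch.randn(in_count, out_count) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_count)
        # Spline path: evaluate the RBF bases for every input element,
        # then mix them with the per-edge coefficients (0.1 is an assumed bandwidth).
        bases = torch.exp(-((x.unsqueeze(-1) - self.centers) ** 2) / 0.1)  # (B, in, n_bases)
        spline_out = torch.einsum("bik,iok->bo", bases, self.spline_w)
        # Base-function path: SiLU applied element-wise, multiplied by the
        # (in_count, out_count) weights, then added to the spline outputs.
        base_out = torch.nn.functional.silu(x) @ self.base_w  # (B, out)
        return spline_out + base_out

# Example: map 2-D inputs to 16 outputs.
layer = SimpleKANLayer(in_count=2, out_count=16)
y = layer(torch.rand(32, 2) * 2.0 - 1.0)  # shape (32, 16)
```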