Backpropagating through a maze with candle and WASM
This demo uses gradient descent to solve a discrete maze. Try playing with the hyperparameters to see how they affect the optimization process! No neural network is involved: the action logits are optimized directly, from a random initialization, for each maze.
Appearances can be deceiving: on larger, harder grids, you may find that much of the optimization time is spent apparently "stuck", followed by a dramatic phase transition. The demo solves maze navigation by relaxing the discrete problem into a stochastic formulation that happens to be end-to-end differentiable. Since every operation is differentiable, backpropagation with standard automatic differentiation (candle's autograd, running client-side) is enough to optimize the action logits directly, without relying on the REINFORCE algorithm, Q-learning, Monte Carlo rollouts, or any sort of neural network.
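To make the idea concrete, here is a minimal sketch of that relaxation using candle's autograd and SGD optimizer (assumed dependencies: candle-core and candle-nn). For readability it uses a 1-D corridor of n cells rather than the demo's 2-D maze, and the cell count, horizon, learning rate, and loss below are illustrative choices, not the demo's actual values: the agent's position is kept as a probability distribution over cells, each step mixes per-action transition matrices by the softmax of that step's logits, and the loss is the negative log-probability of ending on the goal.

```rust
use candle_core::{Device, IndexOp, Result, Tensor, Var};
use candle_nn::ops::softmax;
use candle_nn::optim::{Optimizer, SGD};

fn main() -> Result<()> {
    let device = Device::Cpu;
    let n = 5; // corridor cells; start at cell 0, goal at cell n - 1
    let steps = 8; // horizon (number of actions)
    let n_actions = 2; // 0 = left, 1 = right

    // Dense transition matrices T_a with T_a[j, i] = P(next = j | current = i, action = a).
    // A move off the end of the corridor keeps the agent in place, like hitting a wall.
    let mut left = vec![0f32; n * n];
    let mut right = vec![0f32; n * n];
    for i in 0..n {
        left[i.saturating_sub(1) * n + i] = 1.0;
        right[(i + 1).min(n - 1) * n + i] = 1.0;
    }
    let transitions = [
        Tensor::from_vec(left, (n, n), &device)?,
        Tensor::from_vec(right, (n, n), &device)?,
    ];

    // The only parameters: one logit per (step, action), randomly initialized.
    let logits = Var::randn(0f32, 1f32, (steps, n_actions), &device)?;
    let mut opt = SGD::new(vec![logits.clone()], 0.5)?;

    // Start distribution: all probability mass on cell 0, shaped (n, 1) for matmul.
    let mut start = vec![0f32; n];
    start[0] = 1.0;

    for iter in 0..200 {
        let mut state = Tensor::from_vec(start.clone(), (n, 1), &device)?;

        // Roll the position distribution forward; every operation is differentiable.
        for t in 0..steps {
            let probs = softmax(&logits.as_tensor().i(t)?, 0)?; // (n_actions,)
            let mut next = state.zeros_like()?;
            for (a, transition) in transitions.iter().enumerate() {
                // Mix each action's transition by its probability for this step.
                let moved = transition.matmul(&state)?; // (n, 1)
                next = (next + moved.broadcast_mul(&probs.i(a)?)?)?;
            }
            state = next;
        }

        // Loss: negative log-probability of ending on the goal cell.
        let loss = state.i((n - 1, 0))?.log()?.neg()?;
        opt.backward_step(&loss)?; // candle autograd + a plain SGD update on the logits

        if iter % 50 == 0 {
            println!("iter {iter:4}  loss {:.4}", loss.to_scalar::<f32>()?);
        }
    }
    Ok(())
}
```

The key point is that no discrete position is ever sampled: the distribution over cells evolves through matrix multiplications and softmax-weighted sums, so gradients flow all the way back to the logits through ordinary backpropagation.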