Differentiable Programming from Scratch
Differentiable programming has become a hot research topic, and not only because of the popularity of machine learning frameworks like TensorFlow, PyTorch, and JAX. Many fields beyond machine learning are finding differentiable programming to be a useful tool for solving optimization problems.
Formally, an optimization problem asks us to compute

$$\min_{\mathbf{x}} f(\mathbf{x}),$$

which simply means "find the $$\mathbf{x}$$ that results in the smallest possible value of $$f$$." The function $$f$$, typically scalar-valued, is traditionally called an "energy," or in machine learning, a "loss function." Extra constraints are often enforced to limit the valid options for $$\mathbf{x}$$, but we will disregard constrained optimization for now.

Unfortunately, optimization problems in machine learning and graphics often have the opposite structure from the one forward-mode differentiation handles efficiently: $$f$$ has a huge number of inputs (e.g. the coefficients of a 3D scene or neural network) and a single output. That shape favors reverse-mode differentiation (backpropagation), which computes the gradient with respect to every input in a single backward pass, at the cost of storing the intermediate results of the forward pass.

Checkpointing gives us a natural space-time tradeoff: by strategically choosing which nodes store intermediate results (e.g. ones with expensive operations) and recomputing the rest, we can reduce memory usage without dramatically increasing runtime.
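To make the minimization problem concrete, here is a minimal gradient-descent sketch in JAX. The quadratic energy, the target vector, the step size, and the iteration count are all made up for illustration:

```python
import jax
import jax.numpy as jnp

target = jnp.array([1.0, -2.0, 3.0])  # made-up minimizer

def f(x):
    # Scalar-valued "energy": squared distance to a fixed target.
    return jnp.sum((x - target) ** 2)

grad_f = jax.grad(f)  # reverse-mode derivative of the scalar energy

x = jnp.zeros(3)  # initial guess
step = 0.1        # learning rate
for _ in range(100):
    x = x - step * grad_f(x)  # step along the negative gradient

print(x)  # converges toward `target`, the minimizer of f
```

Each iteration moves $$\mathbf{x}$$ a small step in the direction of steepest descent, so the value of $$f$$ shrinks until $$\mathbf{x}$$ settles near the minimum.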
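And to illustrate the checkpointing tradeoff, here is a sketch using JAX's `jax.checkpoint` (also exposed as `jax.remat`); the chain of `layer` calls is a made-up stand-in for an expensive computation graph:

```python
import jax
import jax.numpy as jnp

def layer(x):
    # Stand-in for a node whose intermediate values would normally
    # be kept alive between the forward and backward passes.
    return jnp.tanh(x * 1.5)

def net(x):
    for _ in range(50):
        # jax.checkpoint stores only this node's inputs; its
        # intermediates are recomputed during the backward pass,
        # trading extra compute for lower peak memory.
        x = jax.checkpoint(layer)(x)
    return jnp.sum(x)  # scalar output, as in the optimization setting

grad_net = jax.grad(net)
print(grad_net(jnp.ones(1024)).shape)  # gradient w.r.t. all 1024 inputs
```

Without the `jax.checkpoint` wrapper, intermediates from all fifty layers would be held in memory at once during the backward pass; with it, each layer's intermediates are rematerialized on demand, at the cost of a second forward computation.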