Differentiable Programming from Scratch
Differentiable programming has become a hot research topic, and not only because of the popularity of machine learning frameworks like TensorFlow, PyTorch, and JAX. Many fields beyond machine learning are finding differentiable programming to be a useful tool for solving optimization problems.
Formally, an optimization problem asks us to compute

$$\min_{\mathbf{x}} f(\mathbf{x}),$$

which simply means "find the $$\mathbf{x}$$ that results in the smallest possible value of $$f$$." The function $$f$$, typically scalar-valued, is traditionally called an "energy," or in machine learning, a "loss function." Extra constraints are often enforced to limit the valid options for $$\mathbf{x}$$, but we will disregard constrained optimization for now.

Unfortunately, optimization problems in machine learning and graphics often have the opposite structure from the one forward-mode differentiation handles efficiently: $$f$$ has a huge number of inputs (e.g. the coefficients of a 3D scene or neural network) and a single output. That shape favors reverse-mode differentiation (backpropagation), which computes the gradient with respect to every input in a single backward pass, at the cost of storing the intermediate results of the forward pass.

Checkpointing gives us a natural space-time tradeoff: by strategically choosing which nodes store intermediate results (e.g. ones with expensive operations) and recomputing the rest, we can reduce memory usage without dramatically increasing runtime.
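To make the minimization problem concrete, here is a minimal gradient-descent sketch in JAX. The quadratic energy, the target vector, the step size, and the iteration count are all made up for illustration:

```python
import jax
import jax.numpy as jnp

target = jnp.array([1.0, -2.0, 3.0])  # made-up minimizer

def f(x):
    # Scalar-valued "energy": squared distance to a fixed target.
    return jnp.sum((x - target) ** 2)

grad_f = jax.grad(f)  # reverse-mode derivative of the scalar energy

x = jnp.zeros(3)  # initial guess
step = 0.1        # learning rate
for _ in range(100):
    x = x - step * grad_f(x)  # step along the negative gradient

print(x)  # converges toward `target`, the minimizer of f
```

Each iteration moves $$\mathbf{x}$$ a small step in the direction of steepest descent, so the value of $$f$$ shrinks until $$\mathbf{x}$$ settles near the minimum.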
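And to illustrate the checkpointing tradeoff, here is a sketch using JAX's `jax.checkpoint` (also exposed as `jax.remat`); the chain of `layer` calls is a made-up stand-in for an expensive computation graph:

```python
import jax
import jax.numpy as jnp

def layer(x):
    # Stand-in for a node whose intermediate values would normally
    # be kept alive between the forward and backward passes.
    return jnp.tanh(x * 1.5)

def net(x):
    for _ in range(50):
        # jax.checkpoint stores only this node's inputs; its
        # intermediates are recomputed during the backward pass,
        # trading extra compute for lower peak memory.
        x = jax.checkpoint(layer)(x)
    return jnp.sum(x)  # scalar output, as in the optimization setting

grad_net = jax.grad(net)
print(grad_net(jnp.ones(1024)).shape)  # gradient w.r.t. all 1024 inputs
```

Without the `jax.checkpoint` wrapper, intermediates from all fifty layers would be held in memory at once during the backward pass; with it, each layer's intermediates are rematerialized on demand, at the cost of a second forward computation.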