PyTorch 101: Understanding Graphs, Automatic Differentiation and Autograd
In this article, we dive into how PyTorch’s Autograd engine performs automatic differentiation.
The backward pass is a bit more involved than the forward pass, since it uses the chain rule to compute the gradients of the weights with respect to the loss. To compute these derivatives in our neural network, we generally call backward on the Tensor representing our loss. During this pass, PyTorch frees the buffers of non-leaf (intermediate) nodes once their gradients have been propagated; you can prevent this non-leaf-buffer-destroying behaviour by passing the retain_graph=True argument to the backward function, which keeps the graph so it can be traversed again.
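As a rough illustration of this, consider the minimal sketch below (the tensor names w, x and loss are just placeholders, not taken from any particular model): calling backward on the loss populates the .grad field of the leaf tensor, and retain_graph=True allows a second backward pass through the same graph.

```python
import torch

# A leaf tensor standing in for a learnable weight, and a plain input.
w = torch.randn(3, requires_grad=True)
x = torch.tensor([1.0, 2.0, 3.0])

# A scalar loss built from w, so a computation graph is recorded.
loss = (w * x).sum() ** 2

# backward applies the chain rule through the graph and fills w.grad
# with d(loss)/d(w). retain_graph=True keeps the non-leaf buffers
# instead of freeing them after this pass.
loss.backward(retain_graph=True)
print(w.grad)

# Because the graph was retained, a second backward call is allowed;
# gradients accumulate into w.grad, so it now holds twice the value.
loss.backward()
print(w.grad)
```

Without retain_graph=True, the second call to backward would raise an error, because the intermediate buffers needed to re-traverse the graph would already have been freed.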