The Matrix Calculus You Need for Deep Learning
Most of us last saw calculus in school, but derivatives are a critical part of machine learning, particularly deep neural networks, which are trained by optimizing a loss function. This article is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed.
A quick look at the data flow diagram for $y = u_2(x, u_1)$, where the intermediate variable $u_1$ is itself a function of $x$, shows multiple paths from $x$ to $y$, making it clear that we must consider both the direct and the indirect (through $u_1(x)$) dependencies of $y$ on $x$. The beauty of the vector formula over the single-variable chain rule is that it automatically takes the total derivative into consideration while maintaining the same notational simplicity.
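To make this concrete, here is a minimal worked example (the specific decomposition is an illustrative assumption, not quoted from this excerpt): let $y = f(x) = x + x^2$, and introduce intermediate variables $u_1(x) = x^2$ and $u_2(x, u_1) = x + u_1$. The total-derivative chain rule sums the contribution of every path from $x$ to $y$:

\[
\frac{dy}{dx}
= \frac{\partial u_2}{\partial x}\,\frac{dx}{dx}
+ \frac{\partial u_2}{\partial u_1}\,\frac{du_1}{dx}
= 1 \cdot 1 + 1 \cdot 2x
= 1 + 2x,
\]

which matches differentiating $f(x) = x + x^2$ directly. The first term captures the direct path ($x$ appears in $u_2$ itself); the second captures the indirect path through $u_1$.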