
Maximum likelihood estimation and loss functions


When I started learning about loss functions, I could always understand the intuition behind them. For example, the mean squared error (MSE) for regression seemed logical: penalizing large deviations from the ground truth makes sense. But one thing always bothered me: I could never have come up with those loss functions on my own. Where did they come from? Why do we use these specific formulas and not something else? This frustration led me to dig deeper into the mathematical and probabilistic foundations of loss functions.

In this blog, I'll take you through that journey and show that these loss functions are not arbitrary but follow naturally from statistical principles. The key idea is maximum likelihood estimation (MLE): finding the parameters that make the observed data most likely under the assumed model. We'll see how MLE serves as the foundation from which many widely used loss functions in machine learning are derived.
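To make this concrete before we dive in, here is a minimal sketch of the kind of derivation I mean for MSE, assuming a regression model with i.i.d. Gaussian noise (the symbols $f_\theta$, $\sigma$, and $n$ are my own notation for this sketch). Suppose $y_i = f_\theta(x_i) + \epsilon_i$ with $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$. The likelihood of the observed data is

\[
L(\theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(y_i - f_\theta(x_i))^2}{2\sigma^2} \right),
\]

so the negative log-likelihood is

\[
-\log L(\theta) = \frac{n}{2}\log(2\pi\sigma^2) + \frac{1}{2\sigma^2} \sum_{i=1}^{n} \bigl(y_i - f_\theta(x_i)\bigr)^2 .
\]

The first term and the factor $1/(2\sigma^2)$ do not depend on $\theta$, so maximizing the likelihood is equivalent to minimizing $\sum_i (y_i - f_\theta(x_i))^2$, which is exactly the squared-error loss (MSE up to the constant $1/n$). The "arbitrary" formula falls out of the Gaussian noise assumption.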
