
How to use Prometheus to efficiently detect anomalies at scale

Learn how we built a dependable, open source framework for anomaly detection that you can use today as part of your root-cause analysis workflow.

Selecting the time window was the biggest choice we had to make here, because it involves a tradeoff: a longer window gives a smoother middle line, but that line lags further behind your metric. We found that one hour was the sweet spot, as the system is tuned for short-term anomaly detection (large deviations in small time frames). The next step in making the data actionable is to tie the framework back to your pre-established, SLO-based alerts as part of your root-cause analysis workflow.
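To make the window tradeoff concrete, here is a minimal Python sketch of a rolling "middle line" with bands around it. It is an illustration under stated assumptions, not the framework's actual implementation: the function name `detect_anomalies`, the use of a standard deviation to size the bands, and the `band_multiplier` of 2.0 are all hypothetical. Only the one-hour window comes from the text above.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(samples, window_seconds=3600, band_multiplier=2.0):
    """Flag samples that fall outside bands around a rolling average.

    samples: iterable of (timestamp_seconds, value), sorted by timestamp.
    window_seconds: length of the lookback window (one hour, per the text).
    band_multiplier: hypothetical band width, in standard deviations.
    """
    window = deque()  # samples seen within the lookback window
    results = []
    for ts, value in samples:
        # Drop samples that have aged out of the lookback window.
        while window and window[0][0] <= ts - window_seconds:
            window.popleft()

        if len(window) >= 2:
            past_values = [v for _, v in window]
            middle = mean(past_values)    # the smoothed "middle line"
            spread = pstdev(past_values)  # assumed measure of band width
            lower = middle - band_multiplier * spread
            upper = middle + band_multiplier * spread
            is_anomaly = value < lower or value > upper
        else:
            # Not enough history yet to build a baseline.
            middle = lower = upper = None
            is_anomaly = False

        results.append({
            "timestamp": ts,
            "value": value,
            "middle": middle,
            "lower": lower,
            "upper": upper,
            "is_anomaly": is_anomaly,
        })
        # The current sample joins the baseline only after it is evaluated.
        window.append((ts, value))
    return results
```

A larger `window_seconds` averages over more history, so the middle line is smoother but reacts more slowly to real shifts in the metric. A smaller window tracks the metric closely but lets more noise into the bands.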

