Over the last few weeks, I gave a demo of our product to a few friends and acquaintances, ranging from engineers to product managers to analysts to non-technical business folks. One of the key takeaways from these conversations was that most people equated anomalies with rule-based alerts – an alert that triggers when a metric goes outside a specified range. For example, alert when CPU utilization is greater than 90%.
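That mental model is easy to write down. A minimal sketch of such a rule (the function name and the 90% default are illustrative, not from our product):

```python
def rule_based_alert(cpu_utilization: float, threshold: float = 90.0) -> bool:
    """Fire when the metric leaves a fixed, pre-specified range."""
    return cpu_utilization > threshold

print(rule_based_alert(95.0))  # True: alert fires
print(rule_based_alert(85.0))  # False: stays quiet
```

The threshold is chosen by a human, once, and never moves. That is exactly where this model and the dictionary definition of an anomaly part ways.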
If you do a Google search for “anomaly”, you get the following dictionary definition:
something that deviates from what is standard, normal, or expected.
How do you define what is normal or expected?
Let’s do a small test and see if we can visually identify abnormalities, or anomalies, in data.
The chart below shows hourly data for a real metric for 30 days. Visually, it looks like there are two anomalies – one on 24-Feb and another on 3-Mar.
Let’s look at the same metric at daily granularity instead of hourly. In this view, 24-Feb doesn’t seem to be an anomaly. There appears to be only one anomaly, on 3-Mar.
Now let’s change the duration of the data from 30 days to 180 days. There are spikes every few days. Is that normal for the metric? Maybe the metric spikes at the start of every month. But the spikes don’t seem to be evenly spaced out. Is 3-Mar still an anomaly?
Now let’s look at this 180-day data on a weekly basis. Is the week of 3-Mar an anomaly? It doesn’t seem so.
What we just saw is that there’s no one normal for a metric. What’s normal depends on at least two factors:
- Granularity of data (hourly, daily, weekly, …)
- Amount of historical data
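The granularity effect is easy to reproduce. Below is a hedged sketch using a simple z-score notion of “normal” (this is not our product’s actual method; the data is synthetic and the three-sigma threshold is just a common convention). The same one-hour spike is flagged at hourly granularity but disappears once the data is rolled up to daily:

```python
import statistics

def zscore_anomalies(values, threshold=3.0):
    """Indices of points more than `threshold` standard deviations away
    from the mean of whatever series they are judged against."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mean) > threshold * stdev]

# Synthetic metric: 14 days of hourly data with a weekly rhythm,
# plus a single one-hour spike on day 5.
weekly_offset = [0, 2, 4, 2, 0, -2, -4]   # per-day baseline shift
hourly = [100.0 + weekly_offset[(h // 24) % 7] for h in range(14 * 24)]
hourly[5 * 24 + 10] += 80.0               # the spike

# Same data at daily granularity: sum each day's 24 points.
daily = [sum(hourly[d * 24:(d + 1) * 24]) for d in range(14)]

print(zscore_anomalies(hourly))  # [130]: the spike hour stands out
print(zscore_anomalies(daily))   # []: at daily granularity, nothing does
```

Nothing about the detector changed between the two calls; only the shape of the data it was asked to judge.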
Additionally, with every change in the underlying data, the normal changes as well. What’s an anomaly today might not have been considered an anomaly 6 months ago.
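The amount of history matters in the same way. In this sketch (again synthetic data and an illustrative z-score rule, not our product’s method), a spike judged against only the last month looks abnormal, but judged against six months of history, where similar spikes are routine, it does not:

```python
import statistics

def is_anomaly(history, today, threshold=3.0):
    """Judge today's value against the mean/stdev of a trailing window."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return abs(today - mean) > threshold * stdev

# Synthetic daily history, oldest first: spikes to 180 every 7th day for
# the first ~5 months, then a quiet stretch with mild wobble; today spikes.
history = [180.0 if (d % 7 == 0 and d < 150) else 100.0 + (d % 3 - 1)
           for d in range(179)]
today = 180.0

print(is_anomaly(history[-29:], today))  # True: against the last month, abnormal
print(is_anomaly(history, today))        # False: against 6 months, routine
```

Same point, same detector, different verdict, purely because of how much history it was shown.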
On a separate note, below is the Nasdaq Composite index since 1981. Can you visually find the anomalies? Dot-com bubble burst of 2001? Financial crisis of 2008? Covid pandemic of 2020?