This skewed distribution of incident durations results in a fuzzy mean that is less helpful than you might hope, especially if you’re comparing means to tell a story about your incident response capability over time. The VOID report 2022 describes a Monte Carlo simulation approach that demonstrates this with real-world incident duration data.
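To get a feel for that kind of simulation, here is a minimal sketch in Python. It is not the VOID report’s code or data: the durations are synthetic, drawn from a log-normal distribution whose parameters I have picked arbitrarily, and the function names are my own. Each trial splits a batch of incidents into a ‘before’ and ‘after’ half, optionally shortens every ‘after’ incident by a fixed fraction, and records the relative change in MTTR.

```python
import random
import statistics


def mttr_change_distribution(n_incidents=100, trials=2000, improvement=0.0, seed=1):
    """Simulate relative MTTR changes between two random halves of incidents.

    improvement is the fraction by which every 'after' duration is shortened.
    Returns one value of (before - after) / before per trial.
    """
    rng = random.Random(seed)
    half = n_incidents // 2
    changes = []
    for _ in range(trials):
        # Assumption: synthetic, right-skewed durations in minutes (log-normal).
        durations = [rng.lognormvariate(4.0, 1.2) for _ in range(n_incidents)]
        before = durations[:half]
        after = [d * (1 - improvement) for d in durations[half:]]
        mttr_before = statistics.fmean(before)
        mttr_after = statistics.fmean(after)
        changes.append((mttr_before - mttr_after) / mttr_before)
    return changes


def share_at_least(changes, threshold):
    """Fraction of trials whose MTTR improved by at least `threshold`."""
    return sum(c >= threshold for c in changes) / len(changes)


# Same seed, so both runs see the same underlying incidents.
no_change = mttr_change_distribution(improvement=0.0)
real_gain = mttr_change_distribution(improvement=0.10)

print("No real change, MTTR 'improved' by 10%+:", share_at_least(no_change, 0.10))
print("Real 10% gain, MTTR improved by 10%+:   ", share_at_least(real_gain, 0.10))
```

If the two printed shares are close to each other, a 10% drop in MTTR tells you very little about whether anything actually improved.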
Can I just use the median instead?
Not really. The median is less influenced by ‘outliers’ than the mean, so it may give you a more representative picture of a typical incident duration. However, when comparing TTR changes over time, it’s still unlikely to tell a useful story. In incident response, it’s the outliers that we really care about. Do we really want a stat that renders that two-day monster outage invisible?
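A quick illustration of that invisibility, using made-up durations in minutes (the numbers are purely hypothetical):

```python
import statistics

# Hypothetical incident durations in minutes for a quiet month.
typical = [22, 35, 41, 48, 55, 63, 90]

# The same month with one two-day outage added.
two_day_outage = 2 * 24 * 60
with_outage = typical + [two_day_outage]

median_before = statistics.median(typical)
median_after = statistics.median(with_outage)
mean_before = statistics.mean(typical)
mean_after = statistics.mean(with_outage)

print(f"Median: {median_before} -> {median_after}")
print(f"Mean:   {mean_before:.0f} -> {mean_after:.0f}")
```

The median moves from 48 to 51.5 minutes, while the mean jumps from about 51 to about 404. The median barely registers the outage, and the mean is now dominated by a single incident. Neither number tells the story on its own.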
Sample sizes tend to be low
Statistically, the distribution of a sample mean tends to become ‘more normal’ as the sample size grows. However, most organisations thankfully suffer too few incidents for the mean of their incident durations to behave this way. Given a choice between an informative MTTR or fewer incidents, which would you choose?
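As a rough sketch of what low sample sizes do to the mean, the snippet below repeatedly draws batches of synthetic incidents and measures how much the resulting MTTR wobbles from batch to batch. The log-normal distribution and its parameters are assumptions of mine, not real incident data, and `mttr_relative_spread` is a name I made up for this example.

```python
import random
import statistics


def mttr_relative_spread(n_incidents, trials=500, seed=7):
    """Standard deviation of simulated MTTRs, relative to their average."""
    rng = random.Random(seed)
    mttrs = []
    for _ in range(trials):
        # Assumption: synthetic, right-skewed durations in minutes (log-normal).
        durations = [rng.lognormvariate(4.0, 1.2) for _ in range(n_incidents)]
        mttrs.append(statistics.fmean(durations))
    return statistics.stdev(mttrs) / statistics.fmean(mttrs)


spreads = {n: mttr_relative_spread(n) for n in (10, 100, 1000)}
for n, spread in spreads.items():
    print(f"{n:>5} incidents per period: MTTR varies by roughly ±{spread:.0%}")
```

Nothing about the underlying ‘incident response’ changes between batches here, so all of the variation in MTTR is noise, and it shrinks only as the number of incidents per period grows.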
Duration != Severity
Even if MTTR did paint an accurate picture of how incident durations were changing over time, it doesn’t necessarily tell an accurate story about incident impact. Again, the VOID report 2022 demonstrates a poor correlation between duration and severity, so it’s possible that your MTTR could be decreasing while impact is increasing.
A useful measure reduces uncertainty in decision making. What might you do if your MTTR increased? How about if it reduced? Unfortunately, MTTR is unlikely to help inform your decision making. Improving your incident response is unlikely to manifest in the mean, so you could be doing a great job or a poor job, and your MTTR won’t tell you. Or worse, it’ll tell you something, you’ll go looking for the cause, and you won’t find it.
So what might you use instead of, or in addition to, MTTR? That’ll be the subject of the next post.
The VOID Report 2022
Incident Metrics in SRE – Štěpán Davidovič
How to Measure Anything: Finding the Value of Intangibles in Business – Douglas W. Hubbard