Measuring Security Burndown Part 1
TL;DR A few months ago, I introduced BOOM (Baseline Objectives and Optimization Measurement) for DevSecOps. In this blog, I'll set the theoretical stage and rationale for measuring burndown rates. They are the first of several DevSecOps metrics for learning whether your capabilities are getting better or worse over time. Most importantly, they let you measure whether your efforts to shift security left actually change what shows up in production.
Turtles Are Blocking My Metrics!
In “A Brief History Of Time” by Stephen Hawking, Hawking recalls an interesting situation that happened to Bertrand Russell during one of his lectures on our solar system. Russell was explaining gravitational pull starting with our planet’s relationship to the sun and in turn how our solar system was related to the larger galaxy. A little old lady in the audience took issue with that concept, and interrupted:
"What you have told us is rubbish. The world is really a flat plate supported on the back of a giant tortoise."
Upon hearing this, Professor Russell asked her what was supporting that tortoise in space.
Her response: “You're a very clever young man, very clever. But, it’s turtles all the way down!”
In cybersecurity, I want to move past the turtles. They represent layers of myths and excuses that can block security measurement, improvement, and acceleration. As a security practitioner (and leader), I don’t have the luxury of throwing up my hands in the face of complexity saying, “It’s impossible to measure!” Or, “we don’t have enough data!” Or, “We haven’t hired a metrics person yet.” These are all versions of, “It’s turtles all the way down!” The best way to bust through these measurement myths is to simply start measuring. We will begin our journey with burndown rates.
The What and Why of Burndown Rates
A burndown rate measures items coming on and off a backlog. A backlog is a queue of work. Adding issues to the queue is “burnup.” Removing them is “burndown.” Our goal is to baseline the rate at which burndown happens. We want to know if this rate is changing fast enough and in the right direction relative to our goals.
I’m starting with backlog burndown because it is simple to understand, and the data is accessible through common security tools and ticketing systems. Measuring burndown also paves the way to more advanced forms of measurement you will encounter in BOOM for DevSecOps.
Runtime Reveals Buildtime Efficiency
At my day gig, our focus is injecting security into the continuous integration/continuous deployment (CI/CD) pipeline: fixing security issues predominantly in development (in buildtime), before they can wreak havoc in production (in runtime). While burndown rates measure how risk is managed post-deploy, they are also a key metric for determining whether the things you do in buildtime are effective, because effective buildtime work has a measurable effect on what reaches runtime.
With DevSecOps, your ultimate security goal is to reduce the frequency and time to live of issues in runtime. The rate at which issues come and go in production is a function of what happened (or didn’t happen) earlier in the development pipeline. Over time, we should be able to measure the effects of buildtime security improvements in runtime.
BOOM metrics are focused on detecting these changes, however slight, so you can proactively reduce risk. It’s a capability efficiency view on measuring and managing DevSecOps risk from buildtime to runtime. Without a capability efficiency point of view on burndown, we are stuck with a “whack-a-mole” approach to fixing things, what some call “the hamster wheel of pain.” It’s an endless process of finding and fixing constantly appearing problems: lather, rinse, repeat. Don’t get me wrong, fixing is necessary work. But we have the opportunity to ask, “Are our capabilities getting better or worse over time?” and make improvements that reduce risk.
Getting Certain About Uncertainty
Our journey to better starts with a simple ratio: the cumulative count of security risks removed over the total, Removed / (Removed + Remaining). This turns out to be an average. Beware, averages can lie! Single-point measurements, like averages, drop lots of information and can amplify our uncertainty. That is why you must also measure uncertainty. You will want to know how confident you should be in your results given fluctuating runtime and buildtime capabilities.
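As a minimal sketch, the ratio can be computed directly from two counts. The function name and the sample counts below are illustrative and do not come from any particular tool.

```python
def burndown_rate(removed: int, remaining: int) -> float:
    """Point estimate of burndown: Removed / (Removed + Remaining)."""
    total = removed + remaining
    if total == 0:
        raise ValueError("no issues observed, so the rate is undefined")
    return removed / total


# Hypothetical counts: 30 issues removed, 70 still open.
print(burndown_rate(30, 70))
```

The function returns the same value for 3 of 10 as it does for 300 of 1,000, which is exactly the information a single-point measurement drops.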
As data rolls in, our beliefs about the measured rate of the process continuously update. And as we see more data over time, our uncertainty about that rate starts to settle into a baseline. As our uncertainty shrinks, and rates exceed risk thresholds, we can take action with more confidence.
What does uncertainty look like from a metrics perspective? Maximally, it looks like the graph above. The graph says that all rates are equally plausible. This means that a 2% burndown rate is just as likely as a 99% burndown rate. One of the chief goals of measurement is to quickly move away from being maximally uncertain.
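One common way to encode "all rates are equally plausible" is a Beta(1, 1) distribution, which is flat between 0 and 1. The post does not say which model produced its graph, so treat the choice of a Beta distribution here as an assumption. The sketch below evaluates the density at a few rates to show that none is favored.

```python
import math


def beta_pdf(x: float, a: float, b: float) -> float:
    """Density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


# Assumed flat prior: Beta(1, 1). Every rate gets the same density.
for rate in (0.02, 0.50, 0.99):
    print(rate, beta_pdf(rate, 1, 1))
```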
Now imagine that in one week, ten things are added and three are taken away from your backlog. We may be tempted to say we have a 30% burndown rate. We actually don’t know that. There is a “data-generating process” (aka capability) that creates and removes issues. It takes time and data to baseline that ever-changing process. For the moment, it has created a small amount of data. We can see what that little data tells us about the true burndown rate here:
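A curve like this can be produced with a Beta-Binomial update, sketched below. This assumes a flat Beta(1, 1) starting point and a backlog that was empty at the start of the week, so three issues were removed and seven remain. Neither assumption is stated in the post, and the function names are illustrative.

```python
import math


def update_beta(removed, remaining, prior_a=1.0, prior_b=1.0):
    """Conjugate update: add observed counts to the prior's parameters."""
    return prior_a + removed, prior_b + remaining


def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)


# One week of data: 3 removed, 7 remaining (assumes an empty starting backlog).
a, b = update_beta(removed=3, remaining=7)
mean, sd = beta_mean_sd(a, b)
print(f"posterior Beta({a:g}, {b:g}): mean={mean:.3f}, sd={sd:.3f}")
```

Under these assumptions the spread is roughly 13 percentage points around the estimate, which is why a bare "30%" overstates what ten data points can tell us.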
The two vertical bars above represent a fancy thing called the “Highest Density Interval,” or HDI. HDIs help us place our bets when we are uncertain. Technically, the HDI is the narrowest region under the curve (in purple) that holds a chosen share of the probability. This graph says our burndown rate has an 89% probability of being somewhere between those two bars. (You will run across the HDI and various other machinations as you go further in BOOM.)
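One way to approximate an HDI is to draw samples from the distribution and find the narrowest window that covers the desired share of them. The sketch below does that for a Beta(4, 8) distribution, which is what three removed and seven remaining would give under an assumed flat prior. The model, the sample size, and the function name are all assumptions for illustration.

```python
import math
import random


def hdi_from_samples(samples, mass=0.89):
    """Narrowest interval that contains `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    k = math.ceil(mass * n)  # number of samples the interval must cover
    # Slide a window of k consecutive samples and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


rng = random.Random(0)  # fixed seed so reruns give the same draws
# Assumed posterior for 3 removed, 7 remaining with a flat prior: Beta(4, 8).
draws = [rng.betavariate(4, 8) for _ in range(20_000)]
low, high = hdi_from_samples(draws)
print(f"89% HDI: {low:.2f} to {high:.2f}")
```

Because the interval is the narrowest one available, it shifts toward the peak of a skewed curve, unlike an interval that simply cuts equal tails from both ends.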
Let’s assume a month has gone by, and development has been busily fixing vulnerabilities and configuration issues. What might 140 issues fixed with 79 remaining look like? The simple average is 64%. The reality is that, even with that amount of data, we still have a bit of uncertainty about the baseline backlog burndown rate. Depending on your point of view and your goals, you might consider this to be far too much uncertainty.
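To get a rough feel for how much uncertainty is left, the sketch below uses a normal approximation to a Beta posterior. It assumes a flat Beta(1, 1) prior and treats the curve as symmetric, which is reasonable only when counts are this large. It is an approximation, not the HDI calculation behind the post's graph, and the function name is illustrative.

```python
import math
from statistics import NormalDist


def approx_interval(removed, remaining, mass=0.89):
    """Rough central interval for the burndown rate (normal approximation)."""
    a, b = 1 + removed, 1 + remaining  # assumed flat prior plus observed counts
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    z = NormalDist().inv_cdf(0.5 + mass / 2)  # about 1.6 for an 89% interval
    return mean - z * sd, mean + z * sd


low, high = approx_interval(140, 79)
print(f"simple average: {140 / (140 + 79):.0%}")
print(f"approximate 89% interval: {low:.2f} to {high:.2f}")
```

Under these assumptions the interval spans about ten percentage points. Whether that is tight enough depends on how close the rate sits to your threshold.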
Up Next: Backlog Data
Do you think all this security uncertainty stuff is splitting hairs? It’s not. It’s skepticism in action. If you don’t account for uncertainty you will have trouble knowing if you're really improving. You will have trouble detecting real change. Specifically, you won’t know how fast your rates are accelerating or decelerating in relation to your goals (KPIs). Or worse, you may be fooled into believing you achieved a goal when you haven't.
You may also be tempted to think the data necessary to perform this type of analysis is complex, but it’s not! You likely already have it in a raw form, downloadable from any one of a number of tools. You would want something like: bug_name, open_date, closed_date, owner_id, etc.
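As a sketch of how little is needed, the snippet below turns a CSV export with those columns into removed and remaining counts. The rows are made up for illustration, and it assumes an empty closed_date means the issue is still open. Real exports will differ by tool.

```python
import csv
import io

# Made-up sample export; real column names and date formats vary by tool.
SAMPLE_EXPORT = """bug_name,open_date,closed_date,owner_id
sql-injection-login,2023-01-03,2023-01-10,u17
open-s3-bucket,2023-01-04,,u02
outdated-openssl,2023-01-05,2023-01-06,u17
weak-tls-config,2023-01-09,,u31
"""


def count_backlog(csv_text):
    """Return (removed, remaining), treating an empty closed_date as open."""
    removed = remaining = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["closed_date"].strip():
            removed += 1
        else:
            remaining += 1
    return removed, remaining


removed, remaining = count_backlog(SAMPLE_EXPORT)
print(f"removed={removed}, remaining={remaining}")
```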
In Part 2, we will dive into how to analyze burndown data. It will include many more graphs, code you can run (even if you don’t code), and much more.
As always, feel free to reach out if you have any questions. We’re currently building a solution that lowers risk by making it easier to deploy secure, reliable code, and we’d be happy to show it to you.