How do you measure DevOps success? Talk “metrics” for DevOps work or cloud-native infrastructures, and the conversation tends to revolve around familiar operational and productivity measures. Uptime. Transactions per second. Bugs fixed. Commits. Familiar categories of data that are straightforward to track and would seem to at least passingly correlate with some combination of efficiency, health of the environment, and development speed.
It’s common for IT teams to instrument for and log such data – especially now that machine learning and other modern tools make it much easier to gain insights from large datasets, whether for proactive, predictive analytics or for root cause analysis after a failure.
That's not a DevOps metric!
However, that doesn’t make everything a metric! A metric is better thought of as a key performance indicator: a measurement that’s important to you in some fundamental way.
The metrics that you actually want for DevOps work differ from the much larger set of data you might collect in four important ways.
1. There shouldn’t be too many DevOps metrics. Shoot for no more than ten, and fewer is probably better. Choosing a set of metrics out of all the possible things that you can measure is a deliberate selection process. At the same time, cast a wide net. Consider metrics that can uncover broader organizational or process health issues in addition to more obvious operational and development data that you can collect from your computer systems.
2. DevOps metrics should reflect what’s important to you. How do you define success? As noted by the MIT Center for Information Systems Research (CISR), digital transformation projects involve step changes on two dimensions: customer experience and operational efficiency. The relevant metrics will differ depending upon the dimension on which an organization has chosen to initially focus. If it’s customer experience, a metric such as Net Promoter Score might be appropriate. If it’s efficiency, more cost-centric measurements will be the better match.
3. DevOps metrics should be tied to business outcomes in some manner. You probably don’t really care about how many lines of code your developers write, whether a server had a hardware failure overnight, or how comprehensive your test coverage is. In fact, you may not even directly care about the responsiveness of your website or the rapidity of your updates. But you do care to the degree such metrics can be correlated with customers abandoning shopping carts or leaving for a competitor.
4. Seek a traceable path to root causes. It’s easy to come up with relevant business metrics. They’re the ones that you talk about on your earnings calls. But from the perspective of DevOps and cloud-native infrastructure metrics, there needs to be a traceable path to root causes that operations and development teams can affect. Choosing appropriate metrics is therefore something of a balancing act: they need to be high level enough to have a more or less direct impact on business results, yet sufficiently within IT’s purview that teams can take direct action to improve them.
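One way to check whether an operational measurement is tied to a business outcome is to compute how strongly the two move together. The following is a minimal sketch, not a prescribed method: the `pearson` helper, the variable names, and the daily figures are all invented for illustration, and a real analysis would draw on your own monitoring and sales data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily figures, invented for illustration only.
page_load_seconds = [1.2, 1.4, 2.1, 2.8, 3.5, 1.1, 4.0]
cart_abandonment_rate = [0.61, 0.63, 0.68, 0.72, 0.78, 0.60, 0.81]

r = pearson(page_load_seconds, cart_abandonment_rate)
print(f"correlation between load time and abandonment: {r:.2f}")
```

A coefficient near 1 or -1 suggests the operational measure is worth tracking as a candidate metric; a value near 0 suggests it may just be data. Correlation alone doesn’t establish cause, so treat the result as a prompt for investigation rather than a conclusion.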
Better DevOps metrics
Here are three examples of DevOps metrics that may be relevant for an organization.
- Customer ticket volume is a reasonable proxy for overall customer satisfaction, which, in turn, strongly affects higher-level (and highly valued) measures such as Net Promoter Score, a gauge of customers’ willingness to recommend a company's products or services to others. At the same time, tickets tend to be filed for specific perceived shortcomings related to applications and infrastructure, which gives teams something concrete to act on.
- A measurement such as percentage of failed deployments isn’t so directly tied to anything a customer sees. But it is indicative of process issues that can lead to visible failures or just wasted effort and time. Perhaps test coverage is lacking, or there’s just some systematic issue with the build and deployment pipeline.
- Or how about the job satisfaction of your development team? We talk a lot about culture when we’re discussing DevOps. Maybe it’s a good idea to have a metric or two that ties to how effectively collaboration is working or the effectiveness of your training and career paths.
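Two of the measures mentioned above are simple to compute once the raw data exists. The sketch below is illustrative only: the `Deployment` record, the function names, and the sample values are assumptions rather than part of any particular tool. The Net Promoter Score calculation follows the standard definition, which uses 0-10 survey ratings and subtracts the percentage of detractors (0-6) from the percentage of promoters (9-10).

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical record of one deployment attempt."""
    service: str
    succeeded: bool


def failed_deployment_percentage(deployments):
    """Share of deployment attempts that failed, as a percentage."""
    if not deployments:
        raise ValueError("no deployments to measure")
    failed = sum(1 for d in deployments if not d.succeeded)
    return 100.0 * failed / len(deployments)


def net_promoter_score(ratings):
    """NPS from 0-10 ratings: % promoters (9-10) minus % detractors (0-6)."""
    if not ratings:
        raise ValueError("no survey ratings to measure")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Sample values invented for illustration only.
history = [
    Deployment("checkout", True),
    Deployment("checkout", False),
    Deployment("search", True),
    Deployment("search", True),
]
survey = [10, 9, 8, 7, 6, 3, 10, 9]

print(failed_deployment_percentage(history))  # 1 of 4 failed -> 25.0
print(net_promoter_score(survey))             # 4 promoters, 2 detractors of 8 -> 25.0
```

Both functions raise an error on empty input rather than returning zero, since a score of zero with no data behind it could be mistaken for a real result.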
Beware DevOps anti-patterns
An important caution: When you’re choosing metrics, be on the lookout for anti-patterns – metrics that encourage unproductive or otherwise negative behaviors or that simply aren’t meaningful.
As the behavioral economist Dan Ariely has noted: “Human beings adjust behavior based on the metrics they’re held against. Anything you measure will impel a person to optimize his score on that metric. What you measure is what you’ll get. Period.” Individual productivity measurements may discourage collaboration within and between teams. Simple output measures like the number of bugs fixed may not correlate well with more meaningful assessments of code quality or value delivered to customers.
Perhaps even more common are DevOps metrics that are collected because they’re easy to collect but that don’t really mean anything. Failures that aren’t visible to customers may indicate a process issue – or they may simply demonstrate that the service is absorbing faults and performing as designed.
By all means, collect data. Store it and analyze it. Increasingly that’s the default. But the metrics you choose to chart your path and measure your DevOps success should involve a deliberate decision.