In the agile community, we talk a lot about the need to accept or even encourage failure, and the advantages of “failing fast.” This talk of failure is uncomfortable to many who are trying to make the transition to agile and DevOps culture. IT team members, taught for years never to fail, struggle mightily with the notion that it’s now OK to fail. Leaders struggle too: “What would my board of directors say if I told them we were encouraging failure?” Or for a government agency, “What would Congress say?” or “We’re accountable to the public! We can’t deliberately set out to fail!”
I think the issue here is a matter of language. The agile community wants to emphasize how radical a departure this is from our traditional ways of thinking, so it pointedly uses the term “failure,” partly for its shock value.
The agile community is right: I’ve seen too many organizations miss the key differences between the agile way of thinking and their legacy approaches. But the word “failure” is a bit imprecise. So let’s look more exactly at what the agile community means by failure.
There are four primary types of failure that agile encourages. Let me call these (1) active research, (2) hypothesis testing, (3) innovation, and (4) bias for action.
You’ll notice that I worded these without using the term “failure.”
1. Active research:
Agile stresses the value of investigation by doing. We can learn passively – say, by reading a book – or actively, by trying deliberate experiments conceived with learning objectives in mind.
To take a technical example, let’s say that we are trying to choose between two different message queueing systems. A good way to choose is to quickly spin up some VMs and a test scenario in the cloud, and try out both messaging systems in a use case that is similar to our actual one. Based on the results – or just our observations – we choose one or the other.
Now you could say that one of our two experimental setups was a failure, since in the end we decided that one of the message queueing systems would not meet our needs. True. But a better way to put it is that we reduced risk by gaining information that helped us make a decision. With minimal expenditure, we were able to make a better platform decision. The net return, you might say, is positive, which is not a failure in the usual sense of the word.
2. Hypothesis testing:
Let’s say that someone in the company has a brilliant idea – a product that will solve a customer need, a new feature that will have a positive ROI, an IT process that will streamline a business process. Let’s even say that it has a great business case. (As I pointed out in my book, The Art of Business Value, it is misleading to say that the idea will have a positive ROI, because the business case at this stage must always rest on assumptions. The ROI calculation is valid to the extent that the assumptions are valid.)
It is therefore more appropriate to think of this new idea, or at least its expected impact, as a hypothesis. We don’t really know that it is a good idea – we just have reason to believe it is.
Now we have two choices: We can fully commit to it based on the business case, or we can think of the smallest experiment we can do to confirm the hypothesis.
The latter is a way of reducing risk, again cheaply. If the experiment fails to confirm the hypothesis, then we don’t waste the money building the entire new capability. Yes, you can say that the experiment failed. But that is not really correct: The experiment gave us information that let us avoid a failure.
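The arithmetic behind that choice can be sketched in a few lines. This is only an illustration: the function names and every figure are hypothetical, and it assumes the experiment reliably tells us whether the hypothesis holds.

```python
def expected_waste_if_fully_committed(prob_hypothesis_wrong, full_build_cost):
    """Expected money lost if we build everything and the hypothesis turns out wrong."""
    return prob_hypothesis_wrong * full_build_cost


def expected_saving_from_experiment(prob_hypothesis_wrong, full_build_cost, experiment_cost):
    """Expected waste avoided by testing first, minus what the experiment costs.

    Assumes (hypothetically) that the experiment always reveals whether the idea works.
    """
    avoided_waste = expected_waste_if_fully_committed(prob_hypothesis_wrong, full_build_cost)
    return avoided_waste - experiment_cost


# Hypothetical figures, chosen only to show the calculation.
PROB_WRONG = 0.25            # our guess that the business case assumptions do not hold
FULL_BUILD_COST = 1_000_000  # cost of building the entire new capability
EXPERIMENT_COST = 50_000     # cost of the smallest experiment that tests the idea

waste = expected_waste_if_fully_committed(PROB_WRONG, FULL_BUILD_COST)
saving = expected_saving_from_experiment(PROB_WRONG, FULL_BUILD_COST, EXPERIMENT_COST)

print(f"Expected waste if we commit fully: ${waste:,.0f}")
print(f"Expected saving from experimenting first: ${saving:,.0f}")
```

With these made-up numbers the experiment pays for itself several times over. With a very expensive experiment or a near-certain hypothesis, the same calculation could favor committing directly.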
3. Encouraging innovation:
We allow a number of ideas to move forward with small experiments that will “fail fast” if the new idea doesn’t work. The point is to encourage innovation rather than stifling it before it can take root. It is also a process based on humility: We don’t know for sure which ideas are going to work, so we allow them to prove themselves.
Rather than frame this in terms of failure, I’d call this a portfolio approach. We encourage a number of ideas with the expectation that a few of them will score, and score big.
It is, essentially, an early-stage venture capital model. The net return from the portfolio is largely positive. There is no real failure here.
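A toy calculation shows why the portfolio view matters. The figures below are invented for illustration, and `portfolio_net_return` is a hypothetical helper, not part of any real tool.

```python
def portfolio_net_return(ideas):
    """Net return of a set of experiments, each given as (cost, payoff)."""
    total_cost = sum(cost for cost, _ in ideas)
    total_payoff = sum(payoff for _, payoff in ideas)
    return total_payoff - total_cost


# Hypothetical portfolio: ten small experiments at $20,000 each.
# Eight produce nothing; two pay off.
ideas = [(20_000, 0)] * 8 + [(20_000, 300_000), (20_000, 150_000)]

net = portfolio_net_return(ideas)
ideas_without_payoff = sum(1 for _, payoff in ideas if payoff == 0)

print(f"Ideas that did not pay off: {ideas_without_payoff} of {len(ideas)}")
print(f"Net return of the portfolio: ${net:,.0f}")
```

Judged one idea at a time, eight of the ten "failed." Judged as a portfolio, the hypothetical return is positive.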
4. Prioritizing action:
The last sense in which we use the term “failure” is as a way of encouraging a bias for action. Rather than conducting a long analysis, we move ahead quickly and accept that we might be making a mistake. If so, we will try to catch it quickly and stop.
The economics here are a bit more subtle, but the best way to think of it is as a trade-off: on one side, the cost of analysis and delay; on the other, the cost of the work done before a mistake is caught and the work stops.
The problem with the old way of thinking: We tended to ignore how much the analysis and delay cost us. I have seen government IT programs in which teams spent four years documenting and analyzing ideas before deciding to proceed. The cost was millions of dollars out of pocket, plus four years of making do without a needed capability.
If we can establish a quick checkpoint where we validate the idea, then the risk of moving forward is relatively low compared to the cost of analysis and delay. We are weighing two costs – the cost of the idea not working out vs. the cost of analysis – and choosing the lesser of the two. I wouldn’t call that a failure.
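Here is a rough sketch of that comparison. Every number and name in it is a hypothetical assumption, not data from any real program, and it simplifies by ignoring whatever the analysis itself might have taught us.

```python
def cost_of_analysis_first(analysis_cost, monthly_delay_cost, months_of_analysis):
    """Out-of-pocket analysis spend plus the cost of going without the capability."""
    return analysis_cost + monthly_delay_cost * months_of_analysis


def expected_cost_of_acting_first(prob_idea_fails, cost_until_checkpoint):
    """Expected loss from starting now: work up to the checkpoint is wasted only if the idea fails."""
    return prob_idea_fails * cost_until_checkpoint


def cheaper_option(analysis_total, acting_total):
    """Name the lesser of the two costs."""
    return "act and checkpoint" if acting_total < analysis_total else "analyze first"


# Hypothetical figures for illustration only.
analysis_total = cost_of_analysis_first(
    analysis_cost=2_000_000,    # documentation and analysis spend
    monthly_delay_cost=50_000,  # assumed cost per month of lacking the capability
    months_of_analysis=48,      # four years
)
acting_total = expected_cost_of_acting_first(
    prob_idea_fails=0.5,            # assume even odds that the idea does not work out
    cost_until_checkpoint=300_000,  # work done before the quick checkpoint
)

print(f"Analyze first: ${analysis_total:,.0f}")
print(f"Act, then checkpoint: ${acting_total:,.0f}")
print(f"Lesser cost: {cheaper_option(analysis_total, acting_total)}")
```

The earlier the checkpoint, the smaller `cost_until_checkpoint` becomes, and the more strongly the comparison favors action.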
Minimizing risk, increasing returns
In each of these cases, we could use the word “failure” to be dramatic. But a better way of looking at it is that we are minimizing risk and increasing returns.
Don’t tell your employees that you want them to fail. Tell them that you expect them to make good decisions by testing out ideas before committing to them, and by using actual evidence and data as the basis for making their decisions, just as I described in the last few paragraphs.
Similarly, don’t tell your board or Congress that you are deliberately failing – tell them that you are carefully managing risks and saving money.
Remember, we have introduced agile approaches in order to reduce risk and eliminate the costly failures of the waterfall model. Let’s talk about what we are doing in a way that makes that clear.