Forecasting when a software project is going to be done is difficult. Nobody
disputes this. Software is complex. It’s path dependent. Even the best software
has components coupled in ways that are not easy to anticipate. And while we’re
often asked to complete tasks that are very similar to things we’ve done in the
past, we are also frequently confronted with having to do things that are entirely
new. In short, there is tremendous uncertainty when trying to estimate software
completion dates.
Many in our industry shy away from making estimates for this reason. They know
from experience that precision is impossible, but are asked to predict the
future nonetheless. The most common estimation practices ignore this essential
fact. In planning poker, the team is expected to produce a single point/time
estimate for each task. A person wanting to know when a given user story will
be addressed adds all the individual estimates ahead of it and divides by the
team’s velocity. What could be simpler?
That simple calculation obscures the uncertainty behind the individual
estimates. It produces a result that suggests more precision than is warranted.
This is exacerbated when estimates are treated as commitments with punishment
meted out to those who fail to deliver accordingly. The end result is a whole
industry hesitant to make estimates at all. When forced to estimate anyway,
people frequently pad their numbers to the point where they’re no longer
useful.
Even though we know we should estimate, we often struggle with our tools when
it’s time to estimate a whole project. There’s a better way.
Range Estimates
A Range Estimate captures and quantifies uncertainty in a fairly rigorous way.
The most common type of range estimate is called a 50/90 estimate. In this
case, each task in the project is given two estimates – an Aggressive But
Possible (ABP) one and a Highly Probable (HP) one. Let’s examine how it works
by way of example.
Suppose you were trying to estimate how long it was going to take to fill your
grocery cart with all of the ingredients for a new recipe you want to try. To
make it interesting, we’ll assume that the recipe involves you buying two
ingredients you get frequently – vegetable oil and sliced mushrooms – and one
ingredient you have never purchased – tamarind soup base.
To finish as quickly as possible, you make a plan that has you starting in the
front of the store near the cash registers, only going down the aisles containing the
ingredients on the shopping list, and returning to the cash registers. You’ll
get done fastest if you don’t have to retrace your steps, so you order your
shopping list accordingly. First up are the mushrooms, which are in the produce
section right in front. Next is the tamarind soup base. Then comes the
vegetable oil.
Now it’s time to estimate the total shopping time by predicting how long it
will take to get each ingredient. As mentioned previously, each one gets two
estimates of varying degrees of confidence.
The first is the Aggressive But Possible (ABP) estimate. It is one that is as
likely to be wrong on the low side as it is on the high side. That is, you are
50% confident that you’ll complete the task within the estimated time. The
second estimate, called the Highly Probable (HP) estimate, is one with much more
confidence. If you’re used to padding your estimates to boost confidence,
you’re used to making HP estimates. You want to be 90% confident that you’ll
complete a given task within the estimated time when making the HP estimate.
The Range Estimation technique understands that the truth frequently lies
in the middle. If you were to add all of the ABP task estimates, you’d arrive
at a number that is almost certainly too low for the collection. But if you sum
all the HP estimates and use that, you are going to be too high.
The spread between ABP and HP is the measure of uncertainty. For something
you’ve done before, that spread would probably be pretty small. For a task
whose requirements are still unclear or which requires a brand new technology
or component, the degree of uncertainty, the spread between ABP and HP, will be
much larger.
Returning to our example, the first ingredient to buy is sliced mushrooms. You
buy them all the time and know exactly where they are in the store. Your
Aggressive But Possible estimate is 2 minutes and your Highly Probable estimate
is 3 minutes. There’s not much uncertainty here, partly because you know right
where they are and partly because you’re starting from a known point.
Next up is the tamarind soup base. You’ve never purchased that before, but
you’re guessing that it’s with the other soups which are near the front. If
you’re right, it will take you 3 minutes after getting the mushrooms to find
and get the soup. But if you’re wrong, you have to go searching or asking for
help. Your path going from mushrooms to soup might look like the diagram below.
To account for this uncertainty, you set your ABP estimate to 3 minutes and your
Highly Probable estimate to 10 minutes.
The final ingredient is the vegetable oil. In this case your ABP estimate is 3
minutes and your HP estimate is 6 minutes. Even though you know where it is in
the store, you’re a bit unsure where you’ll be starting from. Time spent
shopping, like software development, is path dependent.
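The three estimates can be written down as data, which makes the spreads easy to compare. The snippet below is a minimal sketch; the `RangeEstimate` class and its field names are invented here for illustration and are not part of any tool mentioned in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeEstimate:
    """One task's 50/90 estimate in minutes (hypothetical helper type)."""
    task: str
    abp: float  # Aggressive But Possible: about 50% confidence
    hp: float   # Highly Probable: about 90% confidence

    @property
    def spread(self) -> float:
        """HP minus ABP, the uncertainty attached to this task."""
        return self.hp - self.abp


shopping_trip = [
    RangeEstimate("sliced mushrooms", abp=2, hp=3),
    RangeEstimate("tamarind soup base", abp=3, hp=10),
    RangeEstimate("vegetable oil", abp=3, hp=6),
]

for item in shopping_trip:
    print(f"{item.task}: spread of {item.spread} min")

# Neither plain sum is a good project estimate on its own.
sum_abp = sum(item.abp for item in shopping_trip)  # 8, almost certainly too low
sum_hp = sum(item.hp for item in shopping_trip)    # 19, likely too high
print(f"sum of ABPs = {sum_abp} min, sum of HPs = {sum_hp} min")
```

The unfamiliar ingredient has a spread of 7 minutes, far larger than the other two. The realistic total lies somewhere between 8 and 19 minutes.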
If you assume that completion times for a given task follow a distribution that
is approximately normal, then the ABP estimate represents the mean of the
distribution. The HP estimate, at 90% confidence, falls approximately 1.28
standard deviations above that.
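To make the normal assumption concrete, the sketch below turns one ABP/HP pair into a distribution using Python's standard `statistics.NormalDist`. It relies on the fact that the 90th percentile of a normal distribution sits about 1.28 standard deviations above its mean. The helper name `task_distribution` is an assumption made for this example.

```python
from statistics import NormalDist

# z-score of the 90th percentile of a standard normal, roughly 1.28.
Z90 = NormalDist().inv_cdf(0.90)


def task_distribution(abp: float, hp: float) -> NormalDist:
    """Normal distribution with mean `abp` and 90th percentile `hp`.

    Hypothetical helper: assumes task time is approximately normal.
    """
    if hp <= abp:
        raise ValueError("HP estimate must be larger than ABP estimate")
    sigma = (hp - abp) / Z90
    return NormalDist(mu=abp, sigma=sigma)


# Tamarind soup base: ABP of 3 minutes, HP of 10 minutes.
soup = task_distribution(abp=3, hp=10)
print(f"standard deviation: {soup.stdev:.2f} min")
print(f"P(done within ABP): {soup.cdf(3):.2f}")
print(f"P(done within HP):  {soup.cdf(10):.2f}")
```

By construction, the two probabilities come out at 0.50 and 0.90. A normal model also allows negative times, so it is only an approximation for tasks with a wide spread.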
Calculating the Estimate
Estimating the time for an entire project turns out to be a relatively
straightforward, solved statistical problem. We add a buffer term to the sum of
all ABPs. The buffer is the square root of the sum of the squared uncertainty
intervals (HP minus ABP for each task). If the task times are independent, this
yields an overall estimate that the project should meet with about 90%
confidence.
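One way to sanity-check the 90% figure is a small Monte Carlo simulation. The sketch below is not part of the technique itself. It assumes each task time is independent and normally distributed, with the ABP as the mean and the HP as the 90th percentile, and it reuses the three (ABP, HP) pairs from the grocery example. The function names are made up for illustration.

```python
import math
import random
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.90)  # roughly 1.28

# (ABP, HP) pairs in minutes: mushrooms, tamarind soup base, vegetable oil.
ESTIMATES = [(2, 3), (3, 10), (3, 6)]


def range_estimate(estimates):
    """Sum of ABPs plus the root-sum-of-squares buffer."""
    total_abp = sum(abp for abp, _ in estimates)
    buffer = math.sqrt(sum((hp - abp) ** 2 for abp, hp in estimates))
    return total_abp + buffer


def fraction_within(estimates, trials=100_000, seed=1):
    """Share of simulated trips that finish within the range estimate."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    target = range_estimate(estimates)
    hits = 0
    for _ in range(trials):
        # Draw one time per task; sigma is chosen so HP is the 90th percentile.
        total = sum(rng.gauss(abp, (hp - abp) / Z90) for abp, hp in estimates)
        if total <= target:
            hits += 1
    return hits / trials


print(f"share of simulated trips within the estimate: {fraction_within(ESTIMATES):.3f}")
```

Under these assumptions the share should land close to 0.90. Correlated tasks or skewed task times would shift that figure, so treat it as a check on the arithmetic rather than a guarantee about real projects.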
For our example, the sum of the ABPs is
`2 + 3 + 3 = 8` minutes
The buffer term is:
`sqrt((3-2)^2 + (10-3)^2 + (6-3)^2) = sqrt(59) ≈ 7.7` minutes
Adding them together yields an estimate of 15.7 minutes of shopping time.
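The same arithmetic can be expressed in a few lines of Python. This is a minimal sketch with a hypothetical `range_estimate` function. It also reports how much each task contributes to the squared buffer, which shows where the uncertainty comes from.

```python
import math

# (task, ABP, HP) in minutes, taken from the grocery example.
tasks = [
    ("sliced mushrooms", 2, 3),
    ("tamarind soup base", 3, 10),
    ("vegetable oil", 3, 6),
]


def range_estimate(tasks):
    """Return (sum of ABPs, buffer, squared spread per task)."""
    sum_abp = sum(abp for _, abp, _ in tasks)
    squared_spreads = {name: (hp - abp) ** 2 for name, abp, hp in tasks}
    buffer = math.sqrt(sum(squared_spreads.values()))
    return sum_abp, buffer, squared_spreads


sum_abp, buffer, squared = range_estimate(tasks)
print(f"sum of ABPs: {sum_abp} min")             # 8
print(f"buffer:      {buffer:.1f} min")          # 7.7
print(f"estimate:    {sum_abp + buffer:.1f} min")  # 15.7

total_squared = sum(squared.values())
for name, value in squared.items():
    print(f"{name}: {value / total_squared:.0%} of the squared buffer")
```

The tamarind soup base accounts for 49 of the 59 squared minutes, about 83% of the squared buffer. One poorly understood task can dominate the uncertainty of the whole estimate.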
Extending the Technique
The 50/90 estimate as normally practiced produces a single estimate – 15.7
minutes in our example. Even though the value is calculated in a way to
quantify uncertainty, communicating the result as a single number cuts against
the idea that our estimates still carry uncertainty.
For this reason, we’ve extended the technique by calculating optimistic and
pessimistic values that reflect the uncertainty in the underlying estimates. In
our toy example, the time spent shopping would range from an optimistic 11.9
minutes to a pessimistic 19.5 minutes.
Tooling
We do range estimates frequently. When we’re making a proposal to a client, we
give them an estimate which allows them to decide if a project is worth doing
and if we’re the appropriate partner. Once we’re in the middle of a project,
we are often asked for an updated forecast of when it will be done. In both
cases, we use range estimates to arrive at the number.
It’s unfortunate that more agile software estimation tools don’t provide support
for range estimates. There would be a lot less anxiety over having to make
forecasts if the process was understood to account for the uncertainty we all
feel. Because of the lack of tooling, we’ve built our own range estimation
tool. It takes the form of a spreadsheet that allows a team to work
collaboratively or in a distributed fashion to make estimates on all the tasks
and stories associated with a project. You can also plug in additional sources
of uncertainty such as the percentage of the project not yet specified – the
unknown unknowns of requirements.
We’re sharing that tool today in Google Sheets, Microsoft Excel, and Numbers
formats. We hope you like it and would welcome your feedback and stories on how
it’s helped you to make more realistic project estimates.
Conclusion
There’s no avoiding the fact that software estimation
is difficult. The Range Estimation technique is no panacea and it’s not
appropriate in all circumstances. But when you’re looking to make whole-project
estimates that both capture and communicate the amount of uncertainty in the
project, having this tool in your toolbox can improve both the transparency and
accuracy of the result.