Why your estimates are probably wrong (and how to fix that)

Matt Stephenson
Jun 3
6 min read

If you’ve worked in software delivery, you’ll recognise this tension:

Stakeholders want a date.

The team resists committing to one. They say, “We can’t give you a date because we’re agile.”

The problem is… the rest of the business still runs on dates and needs you to forecast one.

Marketing needs to plan campaigns. Finance needs forecasts. Customers need commitments. The business needs to know when the team will be available to work on something else.

So “we’ll deliver when we deliver” isn’t really an answer that will wash.

This raises the question: “How do you create a credible forecast in an environment full of uncertainty?”

The flaw in traditional estimating

Traditional effort-based estimation requires you to break the work down, estimate how long each task will take, and add it all up.

Great in theory, but flawed in practice, because accurate effort estimation depends on things you don’t yet know:

All of the work (which is hard to know at a task level at the outset)
Who will do the work (because individual skill is a factor at a task level)
When the work will get done (because most teams get faster as they progress)

So you either spend too long analysing work you don’t fully understand yet (and you definitely still won’t have identified it all), or you lean heavily into contingency, which is really just an allowance of time against a guess of how many tasks you don’t know about yet.

Either way, confidence in the final date is low, and history shows us that the success rate for enterprise-scale projects delivering to time and budget is very low.

A different way to think about it

Instead of asking “What are all the tasks and how long will each take?”, ask yourself “Do we understand the feature requirements (scope) well enough to believe the list is complete?”

My argument is that, at the outset, you can be more confident that you’ve identified all of the feature requirements than that you’ve identified all of the tasks.

I believe, and it’s borne out by experience, that human beings are better at comparing the size and complexity of things relative to one another than they are at putting an absolute size on an individual thing.

That’s the basis of relative complexity estimating.

When it comes to forecasting the end of a body of work, once you’re reasonably happy you’ve scoped it out correctly, ask of each feature, “Is this feature more or less complex than that other feature?”, and give it a score.

You size work using a simple scale (e.g. 1, 2, 3, 5, 8, 13), based on how complex each item is relative to others.

A Fibonacci-style sequence like this is often used because its widening gaps force clearer relative comparisons, helping teams distinguish meaningfully between levels of complexity rather than agonising over false precision.

With relative complexity estimating, you don’t need a detailed task breakdown, named resources or a known sequence of work.

You don’t need to know who will do the work or when, because individual skill differences and team acceleration wash out naturally and are captured in the team’s improving work rate over time.

You just need enough items to compare, and a shared view of what “more complex” means.

A quick note for those in the Agile world

At this point, some of you will be thinking, “this is just story points and velocity.”

And you’d be right. Those of you working in Agile environments will recognise this immediately. But that’s not really the point.

Because this isn’t about Agile, per se. It’s about forecasting. The principle is simple:

You don’t need perfect effort estimates to make a credible forecast. You need enough comparable items, and a consistent approach to sizing them.

Agile teams often use story points and velocity.

But the idea works anywhere you have:

A meaningful number of comparable items
A consistent way of sizing them
A way to track progress over time

What it doesn’t work for is a plan made up of a handful of large, dissimilar chunks. You need enough volume to create a pattern.

Turning complexity into a forecast

Let’s say your total scope adds up to 600 points, and you’re reasonably sure you’ve identified all of the features needed in the solution (not the tasks, the features).

You’ve got 40 weeks before the customer would like to take receipt of the finished product, and you’re going to work in 2-week iterations. So, you have 20 iterations, which means you need to average 30 points per iteration.

So from day one, you can say “If we average 30 points per iteration, and if the feature list remains as we understand it today, we’ll deliver on or around your target date”.
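
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the sum, not a tool: the function name required_velocity and its parameters are made up for this example, and the numbers are the ones from the scenario above.

```python
def required_velocity(total_points, weeks_available, iteration_weeks):
    """Average points per iteration needed to finish the scope in time."""
    # Only whole iterations count towards the plan.
    iterations = weeks_available // iteration_weeks
    if iterations <= 0:
        raise ValueError("not enough time for a single iteration")
    return total_points / iterations


# 600 points, 40 weeks, 2-week iterations (the example above).
print(required_velocity(600, 40, 2))  # prints 30.0
```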

Then you crack on with delivery. Each iteration gives you real data:

Points delivered.
Points remaining.
Revised required work rate to hit the aspirational date.

After each iteration you can revisit your forecast and declare whether you remain on track, you’re ahead or you’re behind.
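
As a rough sketch of that per-iteration check, assuming the 600 point, 20 iteration plan from the example: the function name reforecast, the dictionary keys and the delivered figures below are all hypothetical, invented purely to show the calculation.

```python
def reforecast(total_points, planned_iterations, delivered):
    """Recalculate the forecast after the iterations completed so far.

    delivered: points completed in each finished iteration, oldest first.
    """
    done = sum(delivered)
    remaining = total_points - done
    iterations_left = planned_iterations - len(delivered)
    if iterations_left <= 0:
        raise ValueError("no iterations left before the target date")
    required = remaining / iterations_left
    average = done / len(delivered) if delivered else 0.0
    return {
        "delivered": done,
        "remaining": remaining,
        "required_velocity": required,
        "average_velocity": average,
        # Crude check: is the average rate so far enough to cover the rest?
        "on_track": average >= required,
    }


# Hypothetical first four iterations against the 600 point plan.
result = reforecast(600, 20, [0, 20, 25, 30])
print(result["remaining"], round(result["required_velocity"], 1), result["on_track"])
# prints: 525 32.8 False
```

Comparing against the average of every iteration so far is a deliberately simple choice. Because teams tend to speed up, you might prefer the average of the last few iterations, but that is a judgment call rather than part of the method.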

What this looks like in practice

I introduced this approach on a large platform build for a client.

An aspirational delivery date 15 months in the future (including hiring most of the team).
70% of the team were not in place at the start.
Significant new platform capability.

Before we started we established:

The required velocity was 35 points per iteration.
The aspirational date was on or around the end of March.

We set to work, and the first few iterations were volatile:

Iteration 1 delivered zero points.
Early work took longer than expected.
Velocity fluctuated significantly from iteration to iteration in the first 6 or so.

After six iterations the required velocity per iteration had climbed to 46 points.

But we knew this early. If we had used traditional estimating, slower delivery on the earlier work items would not have given us as much meaningful information about the forecast for the rest of the body of work.

So the business made a decision to increase capacity.

Good forecasting doesn’t necessarily give you any greater certainty, but it does allow earlier intervention than traditional task-level effort estimating would.

The moment the team lost confidence

About halfway through, the team started to doubt their estimates. They felt they were consistently underestimating complexity. So I analysed the data across the first 99 completed stories (features).

What it showed was simple:

Smaller items consistently took less time
Larger items consistently took more time
The relationship held across the dataset

In the first half (chronologically) of the 99 tickets, the average time to complete a 3 point ticket, for example, was 19 days; in the second half it was only 8 days. But the relative complexity assessment was accurate throughout.

In other words, the team was very good at assessing relative complexity right from the outset, and throughout, contrary to their own beliefs.

The issue wasn’t estimation. It was perception. A small number of tricky tickets had dented the team’s self-confidence, but the data proved they were actually pretty good.
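
For anyone wanting to run a similar check on their own tickets, here is a minimal sketch of the kind of analysis described above. It is not the actual analysis or dataset: the ticket data is invented for illustration (chosen only so that the 3 point averages echo the 19 and 8 days mentioned above), and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_days_by_size(tickets):
    """Average days to complete, grouped by point size.

    tickets: list of (points, days_to_complete) pairs.
    """
    by_size = defaultdict(list)
    for points, days in tickets:
        by_size[points].append(days)
    return {points: mean(days) for points, days in sorted(by_size.items())}


def ordering_holds(averages):
    """True if each larger point size took longer on average than the one below."""
    values = list(averages.values())
    return all(a < b for a, b in zip(values, values[1:]))


# Invented tickets, listed in completion order: (points, days to complete).
tickets = [(1, 6), (3, 20), (5, 31), (3, 18), (1, 8), (5, 29),
           (1, 3), (3, 9), (5, 14), (3, 7), (1, 2), (5, 12)]

half = len(tickets) // 2
first_half = mean_days_by_size(tickets[:half])
second_half = mean_days_by_size(tickets[half:])

# The absolute days drop in the second half, but the ordering by size holds in both.
print(ordering_holds(first_half), ordering_holds(second_half))  # prints: True True
```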

What actually improved over time

The beauty of relative complexity is that as the team progresses and gets faster, the complexities don’t change; the acceleration shows up in the work rate (velocity). For example, you might start out needing 35 points per iteration and only just achieve it, but by the end of the period be getting through 45 points, which reduces the average needed per iteration and could mean finishing early.

The team accelerated as they:

Built familiarity
Gained confidence
Worked together more effectively

Their delivery rate increased significantly. Early work was slower. Later work was faster.

And that’s extremely difficult to model upfront with effort-based estimation.

The outcome

We delivered:

Only around 3 weeks later than the original target
On a 15 month programme
With most of the team hired during delivery

The movement in the date wasn’t a forecasting error; it was largely change:

Scope change
Stakeholder feedback

Stakeholders were delighted. If you’d offered me only a 3 week movement at the start, I’d have taken it every time.

Why this works

This approach shifts the conversation from:

“How long will this take?”

To:

“Are we on track?”

And that brings some real advantages:

Faster, lighter estimation
Less reliance on unknowns
Early visibility of risk
Continuous confidence tracking
Better stakeholder conversations

Most importantly:

It allows you to act early while you still have options, rather than having to react late (as you generally do with effort-based estimation, where your only real lever is extra hours through overtime).

Final thought

You don’t remove uncertainty in software delivery through relative complexity estimating. But because you don’t need to identify every task or lean heavily into contingency (you can use contingency by assuming you’ve yet to identify a percentage of the features), you do reduce the uncertainty, and also create the language for talking about the forecast being adrift.

“At our current work rate and given the features that have been added to the body of work, we are not going to deliver by March next year. What shall we do about that?”

Effort-based estimation tries to eliminate uncertainty upfront. Relative complexity accepts it… and manages it over time.

In my experience, one of those approaches consistently leads to better outcomes.

Over to you

How are you forecasting delivery today, and how confident are you in it?