Long Live Estimates! (Story Points Part II)

In my last post, I argued that estimating story points for the purpose of predicting velocity adds no predictive power to the art of forecasting Sprint performance or release dates.

[Image: a linear regression graph. Caption: Estimation… It’s not an exact science.]

However, this isn’t to say that the process of estimating story points lacks value. Because everyone on the team is accountable for the accuracy of the forecast in the Sprint Plan, and everyone has to agree on the final estimate, each team member has an incentive to take an active part in the discussion about the story: to plan how it will get done and to identify any risks or impediments that might stand in the way. That’s where the real value of story point estimates lies.

As such, whether story points include complexity, effort, risk, or anything else is an arbitrary decision that ultimately doesn’t matter, except insofar as it influences the discussion the team has about each item. Whatever method is chosen, it needs to be applied consistently across stories.

Disclaimer: Aside from stimulating discussion during planning, there may be value in using story points for purposes other than velocity. Say, for example, a team is estimating story points based purely on complexity. When they look at the top of the backlog during planning, they find a story that’s the equivalent of digging a ditch: dead simple, but it will take a long time. Because they recognize that it’s the only story they’ll have time to do during the sprint, it’s the only story they choose to pull in. As a result, their velocity tanks that sprint.

The value here may be in how that data point is used down the road. For example, the team might go to management to say ‘You see this dip in the graph here? That’s something you could have outsourced to a less expensive person or group.’ Or the developers might go to the PO and say ‘This kind of story isn’t very well constructed. It’ll keep us busy, but we could create value more quickly if you rearranged the work to leverage our skills more effectively.’

I’ve also been considering whether value might be extracted from story points if they were used as a measure of the width of the ‘cone of uncertainty’ around the estimated time needed to get a story to done. That is, the bigger the story point value, the less certain it is that the time estimate is accurate. For example, a developer might estimate that the tasks needed to get a story done will take ten hours overall. If the story is rated a one, the developer will very likely get it done in about ten hours. If the story is rated thirteen, it might take ten hours, but it could just as easily take two, or twenty, or fifty. If this hypothesis turns out to be true, the team could use the information represented by the story points during a sprint to help decide whether they’re likely to get everything to done. (I’ll be considering this idea in more detail in a future post, when I have more empirical data to throw at it.)
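
To make the ‘cone of uncertainty’ idea a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names hour_range and sprint_range, the UNCERTAINTY_FACTOR table, and the choice to widen the range by dividing and multiplying the estimate by a factor. The factors are invented, not measured. The only one tied to the example above is the factor of five for a thirteen-point story, picked so that a ten-hour estimate spans two to fifty hours.

```python
# Hypothetical mapping from story points to an uncertainty factor.
# These numbers are illustrative assumptions, not empirical data.
UNCERTAINTY_FACTOR = {1: 1.1, 2: 1.25, 3: 1.5, 5: 2.0, 8: 3.0, 13: 5.0}


def hour_range(estimate_hours, story_points):
    """Return (low, high) hours for one story.

    The range is the estimate divided by, and multiplied by, the
    factor assumed for that story point value.
    """
    factor = UNCERTAINTY_FACTOR[story_points]
    return estimate_hours / factor, estimate_hours * factor


def sprint_range(stories):
    """Sum the per-story ranges for a list of (estimate_hours, story_points).

    Adding up every best case and every worst case is deliberately
    crude: it gives the widest possible band, not a likely one.
    """
    ranges = [hour_range(hours, points) for hours, points in stories]
    total_low = sum(low for low, _ in ranges)
    total_high = sum(high for _, high in ranges)
    return total_low, total_high


if __name__ == "__main__":
    low, high = hour_range(10, 13)
    print(f"13-point story estimated at 10h: {low:.0f}h to {high:.0f}h")

    low, high = hour_range(10, 1)
    print(f"1-point story estimated at 10h: {low:.1f}h to {high:.1f}h")
```

Under these assumptions, a team could compare the upper end of sprint_range for the stories still in progress against the hours left in the sprint. Whether the factors bear any relation to reality is exactly the open question that the empirical data would have to settle.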