How story points work

One of my clients is a small software house that builds custom software for its own clients. I helped them successfully introduce Agile (Scrum with XP), and both the team and the business managers are really happy with it.

Because they liked our methods of planning and estimating (story points and velocity), the account managers and sales team started discussing ways to relate story points to dollar values.
To explain why I think this is very risky and not advisable, I wanted to give them some background.
“How big” vs. “How long”
Story points are units that are used to size a piece of functionality or work. Sizing in this case means that story points indicate “how big” a piece of work is. 
This is often confused with “how long” it takes to implement it but in fact “how big” and “how long” are very different things:  

  • The “how long” is highly dependent on which developer is performing the work 
  • The “how big” bears no relationship to who is performing the work

Take several piles of dirt in a room as an analogy: I can say that one pile of dirt is bigger than another. Now suppose the work consists of moving one pile from one end of the room to the other by carrying small amounts of dirt on a spade. It might turn out that Peter is a lot stronger and faster than Paul, so he moves the pile in 2 hours, while someone slower and more junior like Paul needs 6 hours for the same job. The pile is the same size in both cases; only the time differs.
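The analogy can be put into a few lines of Python. This is only an illustrative sketch: the pile size of 60 spadefuls and the per-hour rates are made-up numbers, chosen so that the results match the 2 and 6 hours above.

```python
# Hypothetical numbers, chosen only to reproduce the 2 h / 6 h example.
PILE_SIZE = 60  # "how big": spadefuls of dirt, the same for everyone

# "How fast" each person works, in spadefuls per hour (assumed values).
rates = {"Peter": 30, "Paul": 10}

def hours_to_move(pile_size, spadefuls_per_hour):
    """'How long' depends on both the size and the person doing the work."""
    return pile_size / spadefuls_per_hour

durations = {name: hours_to_move(PILE_SIZE, rate) for name, rate in rates.items()}

for name, hours in durations.items():
    print(f"{name}: pile of {PILE_SIZE} spadefuls takes {hours:.0f} hours")
```

The size appears once and never changes; the duration changes as soon as we swap the person.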
Don’t assign work upfront
Therefore, in order to estimate the “how long” at the beginning of a project or iteration, we would need to know, at the time we estimate, which programmer will do the work. This is highly impractical, as it leads to bottlenecks when developers have to wait for each other to finish their pieces of work. It can cause a chain reaction of delays rippling through the project and leave people over- or under-utilised.
How big?
What we can do, on the other hand, is estimate the “how big” part. Using the piles of dirt analogy, we can say that a pile of dirt is 40 ccm with much more certainty than we can estimate how long it will take someone to move it to the other end of the room, especially if we don’t yet know who will carry out the work. As software functionality is a lot harder to measure than the volume of a pile of dirt, we introduced story points. Instead of saying that a piece of work is 40 ccm we say that its size is e.g. 3 story points. Essentially we are applying the same principle of measuring size rather than time.

How long?
We still need a measure, though, for how long it will take to move all piles of dirt to the other end of the room, i.e. how long it will take to finish the project. One way of doing this is to size all known pieces of work and then perform the work in timeboxed iterations of e.g. 2 weeks. At the end of each iteration we know how many story points the team has implemented, and after a number of iterations we know that the capacity of the team is roughly X points per sprint.

Using this capacity number (called velocity) we have achieved the following:

  • We can plan ahead in terms of time: If a team normally achieves 10 story points per iteration, a project with 100 story points will need about 10 sprints.
  • We can plan ahead in terms of budget: Given that the number of people on a project is constant, we know the number of hours in an iteration and therefore the price of a sprint. If we know roughly how many sprints a project needs, we have an indication of how much it will cost overall.
  • We have managed to make estimates independent of who will be doing the work. Folding the “how long” into the team’s capacity rather than into the individual piece of work makes estimates independent of future work allocation.
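As a rough sketch of the arithmetic behind the first two points: velocity is taken here as the plain average of past sprints, and the sprint history, team size, hours and hourly rate are all invented for illustration.

```python
import math

def average_velocity(completed_points):
    """Velocity as the mean of points completed in past sprints."""
    return sum(completed_points) / len(completed_points)

def sprints_needed(total_points, velocity):
    """Round up: a partly used sprint still has to be run (and paid for)."""
    return math.ceil(total_points / velocity)

def sprint_price(team_size, hours_per_person, hourly_rate):
    """Price of one sprint for a team of constant size."""
    return team_size * hours_per_person * hourly_rate

# Hypothetical figures, for illustration only.
history = [9, 11, 10, 10]            # points completed in the last four sprints
velocity = average_velocity(history)  # 10.0 points per sprint
sprints = sprints_needed(100, velocity)
price = sprint_price(team_size=4, hours_per_person=80, hourly_rate=100)
total_cost = sprints * price

print(f"velocity: {velocity} points per sprint")
print(f"sprints needed: {sprints}")
print(f"indicative cost: {total_cost}")
```

Note that the cost is derived from the number of sprints, not from the points themselves; the points only tell us how many sprints to expect.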

Sizes are relative
As we unfortunately can’t measure the size of software functionality as objectively as piles of dirt, we size pieces of work relative to each other. We have no way of saying that a piece of work has e.g. a universal size of 3, which is why we arbitrarily start with some piece of work and assign it a size. All subsequent pieces of work (user stories or features) can then be measured against this base story as being smaller, bigger, much bigger etc. For this we use the Fibonacci number scale (1, 2, 3, 5, 8, 13), which helps us size all user stories relative to each other.

Which number we start out with does not really matter as the relative sizing in combination with team velocity is sufficient for us to plan ahead. 
Sizes depend on context
It is important to note, though, that story sizes are calibrated against an arbitrary base chosen by the team and that those sizes are by no means absolute or objective. In fact, the sizes and the velocity are only valid for this particular team, and velocity will also change over time as the team increases its capacity by getting better at what they’re doing. Also, the same functionality is often assigned widely different numbers of story points by different teams (the same code might be sized at 30 story points by one team and at 140 points by another).
Overall, this system works well as long as story sizes and team velocity (i.e. the “how big” and “how long”) remain independent of each other and as long as we keep in mind that story sizes are arbitrary values, derived relative to each other within a project.
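To see what this means once money enters the picture, here is a small sketch. The 30 and 140 points are the figures from above; the velocities and the sprint price are assumptions made up for the example.

```python
import math

SPRINT_PRICE = 32000  # assumed: both teams cost the same per sprint

# The same functionality, sized by two teams on their own scales.
# Points are from the example above; velocities are hypothetical.
teams = {
    "Team A": {"points": 30, "velocity": 10},
    "Team B": {"points": 140, "velocity": 35},
}

def cost_per_point(sprint_price, velocity):
    """What one story point 'costs' for a given team."""
    return sprint_price / velocity

def actual_cost(points, velocity, sprint_price):
    """Cost derived the usual way: sprints needed times price per sprint."""
    return math.ceil(points / velocity) * sprint_price

for name, t in teams.items():
    per_point = cost_per_point(SPRINT_PRICE, t["velocity"])
    cost = actual_cost(t["points"], t["velocity"], SPRINT_PRICE)
    print(f"{name}: {per_point:.2f} per point, {cost} for the functionality")

# Pricing Team B's 140 points at Team A's per-point rate:
rate_a = cost_per_point(SPRINT_PRICE, teams["Team A"]["velocity"])
mispriced = teams["Team B"]["points"] * rate_a
print(f"Team B's points at Team A's rate: {mispriced:.0f}")
```

With these numbers a point is "worth" a very different amount for each team, and applying one team's rate to the other team's points gives a price several times off what the work would actually cost.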

Using story points, potentially even across projects, to equate directly to $$ is risky at best.

(Credits for coming up with the “pile of dirt” analogy to Mike Cohn)


Sandy Mamoli