Sometimes average is good.

The Big Estimation Debate
 
There has always been, and no doubt always will be, debate over estimation: the pros and cons, the behavioural anti-patterns, and the how and why of it all.
 

In this post I am not going to get into that debate, but I want to share some experiences I have had with estimation, using the average number of story points as an aid to mid-term planning.
 
Teams often need to understand when a release longer than a couple of sprints might be completed. Personally, I don’t think it is unreasonable to ask “could you give us an indication of when you may be finished so that we can tee up all the other elements of the go-live process?” Software is, after all, just one element of the big picture.
 
Faced with this question, what kind of options are there?
 

  • Tell them “go away we don’t do estimates here”
    • Not very constructive for anyone

  • Break down every theme and epic into their individual stories, hold loads of sizing sessions and just figure something out
    • The just-in-time high value conversations that happen with sizing would get diluted by “what did we say about that story we sized last week again?”
    • It does not really allow for the solution to emerge and starts to have the feel of Big Design Up Front
    • It takes the focus away from producing working software

  • Only size the stories for the upcoming sprint
    • Just-in-time and just-when-they-are-needed clear communications
    • Pretty much the same as “go away” in terms of being helpful to the big picture

  • Switch to Kanban and use measurements like throughput and lead time
    • Potentially a significant change to a team just to get some upfront measurement
    • You need enough time to pass for the data to reach a volume that gives reasonable measurements
    • Still doesn’t solve the need for breaking down epics

 
Here are a couple of options that have worked for me:
 
Scenario 1:

You have a reasonably stable backlog. Reasonable for me was that you could probably guess the stories for the next 3 sprints and be 80% right; this reflects a better-understood product or an increment of an existing, well-known product.
 

  • Break the release into what you currently understand to be the component stories (Themes into Epics, Epics into Stories).
  • Plan and size your first sprint as you normally would.
  • Take the average (mean) of the story points for that sprint and apply that to all the other stories in the backlog. Hot Tip: If the average is a whole number and one that is generally used for sizing, flag it in some way to show that the team has not actually talked about the story. All I do is add 0.1 to the size of the story, but feel free to use whatever you like.
  • Use whatever forward-looking tools you have (spreadsheet, GreenHopper, Rally, etc.) to get your initial “when”. It’s up to you when you start communicating this to the wider audience; my preference is to hold off for as long as possible.
  • Plan and size your second sprint, then calculate the mean across the two sprints; as the sample size grows, the average becomes a more reliable guide. Apply the new score (don’t forget the 0.1 if you need it) to the remainder.
  • Look at your projection tools to understand the likely impact of the new figures, and therefore any communication you may need to do.
  • Repeat until you are done.
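
The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sizing scale, the function names, the sample numbers and the fixed velocity used for the projection are all made up for the example. In practice your spreadsheet or tracking tool would do the projection for you.

```python
from math import ceil
from statistics import mean

# Assumed sizing scale; swap in whatever values your team actually uses.
SIZING_SCALE = {1, 2, 3, 5, 8, 13}


def placeholder_size(sized_points, flag=0.1):
    """Mean of every story sized so far, flagged if it looks like a real size."""
    avg = round(mean(sized_points), 1)
    if avg in SIZING_SCALE:
        # The hot tip: add 0.1 so nobody mistakes this for a discussed size.
        avg = round(avg + flag, 1)
    return avg


def sprints_to_finish(sized_points, unsized_count, velocity):
    """Rough "when": sprints needed for the unsized remainder of the backlog.

    velocity (points per sprint) is an assumption made for this sketch.
    """
    remaining_points = placeholder_size(sized_points) * unsized_count
    return ceil(remaining_points / velocity)


sprint_1 = [3, 5, 2, 8, 3]  # made-up sizes from the first planning session
sprint_2 = [5, 3, 3, 2, 5]  # made-up sizes from the second

first_guess = sprints_to_finish(sprint_1, unsized_count=40, velocity=20)
second_guess = sprints_to_finish(sprint_1 + sprint_2, unsized_count=35, velocity=20)
print(first_guess, second_guess)
```

Each time a sprint is planned, the newly sized stories join the sample and the placeholder for everything else is recalculated, which is all "repeat until you are done" amounts to.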

A by-product of this approach is that you can see whether your sizing estimates are shifting over time. Keep an eye out for any significant swings, as they could be an indicator of a problem.
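
One way to watch for those swings is to compare each sprint's mean story size with the previous sprint's. The sketch below is an assumption about how you might do that, not a prescribed method: the 25% threshold for "significant" and the sample numbers are invented for the example, so pick whatever suits your team.

```python
from statistics import mean


def significant_swings(sprints, threshold=0.25):
    """Return (sprint_number, relative_change) for sprints whose mean story
    size moved by more than `threshold` compared with the sprint before.

    The default threshold of 25% is an arbitrary choice for this sketch.
    """
    means = [mean(points) for points in sprints]
    swings = []
    for i in range(1, len(means)):
        change = (means[i] - means[i - 1]) / means[i - 1]
        if abs(change) > threshold:
            swings.append((i + 1, change))  # sprint numbers start at 1
    return swings


history = [
    [3, 5, 2, 8, 2],    # sprint 1, made-up sizes
    [5, 3, 3, 5, 4],    # sprint 2
    [8, 8, 5, 13, 6],   # sprint 3: stories suddenly look much bigger
]
print(significant_swings(history))
```

A flagged sprint is a prompt for a conversation rather than a verdict; the numbers cannot tell you whether the work really got bigger or the team's sense of a point has drifted.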
 
Scenario 2:

Your work is less stable or less well understood.
 

  • Give your epics a t-shirt size (S, M, L).
  • Plan and size your first sprint as you normally would.
  • Calculate the mean story points of the stories you decided to build.
  • Try to break down 2-3 epics within each size category into stories, and apply the average story points to those stories.
  • Take the average size of these broken-down epics in each category and apply it to the remaining epics of that size in your backlog.
  • Repeat as before for each sprint, trying to break down a few more epics as and when you are able.
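
As a rough sketch of the arithmetic, here is one reading of the steps above in Python: an epic's estimate is the number of stories it broke down into, multiplied by the average story points, and each t-shirt size takes the mean of its sampled epics. The function names and all the numbers are hypothetical and only there to show the shape of the calculation.

```python
from statistics import mean


def epic_size_estimates(avg_story_points, sampled_story_counts):
    """Estimated points per t-shirt size.

    sampled_story_counts maps a size to the number of stories found in each
    epic of that size that the team has broken down so far.
    """
    return {
        size: mean(counts) * avg_story_points
        for size, counts in sampled_story_counts.items()
    }


def backlog_points(estimates, remaining_epics):
    """Total points for epics that have a t-shirt size but no breakdown yet."""
    return sum(estimates[size] * count for size, count in remaining_epics.items())


avg_story = mean([3, 5, 2, 8, 2])                        # first sprint, made-up sizes
sampled = {"S": [3, 4], "M": [6, 8, 7], "L": [12, 15]}  # stories per broken-down epic
remaining = {"S": 4, "M": 3, "L": 2}                     # epics still only t-shirt sized

estimates = epic_size_estimates(avg_story, sampled)
total = backlog_points(estimates, remaining)
print(estimates, total)
```

As more epics are broken down each sprint, they move from `remaining` into `sampled`, and both the story average and the per-size estimates are recalculated.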

This method is obviously more rough and ready than the one for a more stable system, but trying to get stability in a rapidly changing system is not the point. Putting in a system that copes with the variation is what you’re after.

Both of the approaches outlined above have helped me to understand the “when” question and to be able to provide a reasonable answer. In addition there are these benefits:

  • You don’t have to put in too much effort up front; story sizing happens when it needs to
  • You are adjusting the data based on rapid feedback loops
  • You get a chance to see if your sizes are shifting over time

 
Being average is not always a bad thing; it can save you time and help you avoid some of the anguish that comes with estimation.
 

Mike Lowery
2 Comments
  • Jeremy Morris

    Hi Mike,

    Nice article! Are you able to offer any advice on how scenario 2 might work for a tribe (4-5 scrum teams in our case)?

    Points/velocity are different for each team (as expected).

    Thank you

    Jeremy

    December 4, 2014 at 4:04 pm
    • Mike

      Hi Jeremy,
      If the teams in the tribe work on the same product then it’s likely that t-shirt sizes are rough enough that, for the most part, they could agree on them. Alternatively, you could do t-shirt sizing with a cross section from all of the teams; that way you get the various biases from each team included. Assuming that this worked (if they don’t agree on sizing and the teams work on the same product, you may have vision / buy-in issues), you could get the teams to do a breakdown on a few stories, and maybe get each team to do the same ones; that way you could get a better “average”.

      December 18, 2014 at 7:41 am
