Monte Carlo Forecasting Awesomeness!

kuoman
Dec 20, 2023

Updated: Dec 21, 2023

As a developer, I'm dedicated to maximizing my hands-on keyboard development time, while in my role as a manager/director, my focus is on minimizing unnecessary waste across the team. I've previously discussed the effectiveness of Discovery Trees as a simple yet powerful method for making discoveries, enabling just-in-time planning, and enhancing work visibility. These hierarchical trees adhere to straightforward rules, originating from a root work item and branching out to encompass associated tasks, consistently aiming for the "next simplest thing" to advance the overarching goal. This approach has proven instrumental in optimizing work efficiency, particularly concerning epics, stories, tasks, and iteration planning within Agile frameworks.

Having addressed the item level, the logical progression involves tackling the challenges of estimation and forecasting. Numerous studies underscore the inherent human difficulty in estimating tasks beyond an hour, or at most, a day. Faced with this reality, we are presented with a choice: persist in the time-consuming pursuit of refining our estimating skills or explore more effective alternatives. I advocate for the latter and propose a comprehensive process that integrates Story Maps, Discovery Trees, and Monte Carlo estimation.

Story Maps serve as a valuable tool for outlining a rough yet sufficient skeletal structure or overall goal. Discovery Trees, in turn, breathe life into items from the Story Map, offering visibility into ongoing work and revealing insights as they emerge. The culmination of this process is found in Monte Carlo estimation. By leveraging data from Story Maps and Discovery Trees, Monte Carlo estimation catalyzes valuable insights and meaningful discussions.

The Monte Carlo estimation simulator takes your data, runs it through statistical simulations (a lot of them), and generates confidence intervals. Troy Magennis at FocusedObjective has popularized this method. He provides classes and this spreadsheet to folks to help them leverage this simulation for the power of good. I was introduced to this only recently and was very skeptical. However, I tried it and was amazed at its insights, accuracy, and the discussions that it helped to foster. I’m going to dive into some details that are found on different tabs of the above-linked spreadsheet. I’ll try to keep it high level.
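To make the idea concrete, here is a minimal Python sketch of the kind of simulation described above. It is not the spreadsheet's actual algorithm: the function names, the uniform pick between the low and high story counts, and the replaying of random past weeks are all assumptions made for illustration.

```python
import math
import random


def simulate_weeks(remaining_low, remaining_high, weekly_throughput,
                   trials=10_000, seed=None):
    """Return a sorted list of simulated weeks-to-finish, one per trial."""
    if max(weekly_throughput) <= 0:
        raise ValueError("need at least one week with positive throughput")
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        # Assumption: the true backlog is somewhere in the low/high range.
        remaining = rng.randint(remaining_low, remaining_high)
        weeks = 0
        while remaining > 0:
            # Replay a randomly chosen past week of throughput.
            remaining -= rng.choice(weekly_throughput)
            weeks += 1
        results.append(weeks)
    results.sort()
    return results


def weeks_at_confidence(sorted_weeks, confidence):
    """Smallest week count that at least `confidence` of the trials finished within."""
    index = max(math.ceil(confidence * len(sorted_weeks)) - 1, 0)
    return sorted_weeks[index]


# Hypothetical team data: 20 to 30 stories left, six weeks of history.
history = [2, 3, 1, 4, 2, 3]
weeks = simulate_weeks(20, 30, history, seed=1)
print("85% confidence:", weeks_at_confidence(weeks, 0.85), "weeks")
```

Reading off different confidence levels from the same sorted list of trials is what produces the range of dates, rather than a single estimate.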

On the spreadsheet, the Forecast tab is where a lot of the action happens.

Your team data goes in item #2 on the Forecast tab. The Stories Remaining count can come from either the Story Map or completed Discovery Trees. You can then set the Scope Complexity to reflect how confident you are that the remaining stories capture the complexity shown in the current Story Map.

The #3 Split Details, like the “stories remaining,” are tracked values, and I derive them from how many tabs/items/nodes are in each branch of a Discovery Tree. I have tried using just the remaining stories here, and it works OK. I prefer the fidelity of finer-grained data, which takes a little more effort: tracking the remaining branch nodes on the trees.

Throughput, #4 on the sheet, can be either Data (a weekly value that you track and that evolves over time) or Estimate (generic industry-standard values that Troy has derived). I prefer to start with the industry-standard values and replace them with my own over time (more on that soon). The final value, How Long, is how many of your remaining stories you will accomplish in the cycle about to be done. In my experience, it’s not uncommon for this number to be fairly static. The spreadsheet requires you to keep track of some values from week to week. I call them “tracked values”: how many stories remain (low/high) and the split rate (low/high) for stories. For example, when you start, the low guess and the high guess for remaining work items will be the same number, say 3. In your next cycle, perhaps you accomplished 2 stories and added 5. For that cycle’s forecast, the low will be 3 (from the previous week) and the high will be 6 (1 undone and 5 added). The third time you run your forecast, you will run with 3/6 unless you have a new low or high.
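The low/high bookkeeping in that example can be sketched in a few lines of Python. This is one reading of the rule, assuming the range only widens when a cycle sets a new extreme; the function name is hypothetical and is not part of the spreadsheet.

```python
def update_tracked_range(low, high, current):
    """Widen the tracked low/high range only when this cycle sets a new extreme."""
    return min(low, current), max(high, current)


# Cycle 1: 3 stories remain, so the low and high guesses both start at 3.
low, high = 3, 3

# Cycle 2: 2 stories finished and 5 added, so 3 - 2 + 5 = 6 remain.
remaining = 3 - 2 + 5
low, high = update_tracked_range(low, high, remaining)
print(low, high)  # 3 6

# Cycle 3: say 5 remain (a made-up number); no new extreme, so 3/6 holds.
low, high = update_tracked_range(low, high, 5)
print(low, high)  # 3 6
```

The same helper would work for the split rate, since it is tracked as a low/high pair in the same way.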

For my #4 Throughput, I prefer to use Data. I track this number every week and enter it on the Historical Samples tab, replacing the existing values as I get new ones. Eventually, this dataset becomes a better representation of the team. Although this number could be thrown off in any given week for any reason, I keep it real and don’t adjust it, regardless of how ugly or awesome it is.
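If you track the samples outside the spreadsheet, the replace-as-you-go habit could look like the Python sketch below. The fixed window size, the oldest-first replacement order, and the starting numbers are assumptions for illustration; on the Historical Samples tab this is a manual edit.

```python
from collections import deque


def seed_samples(estimates):
    """Start the sample window with placeholder estimate values."""
    return deque(estimates, maxlen=len(estimates))


def record_week(samples, completed):
    """Append the real count unadjusted; the oldest value falls out of the window."""
    samples.append(completed)
    return samples


samples = seed_samples([1, 2, 3, 4, 5])  # hypothetical starting estimates
record_week(samples, 0)  # an ugly week still gets recorded as-is
record_week(samples, 7)  # so does an awesome one
print(list(samples))  # [3, 4, 5, 0, 7]
```

After as many weeks as the window is long, every placeholder has been pushed out and the samples are entirely the team's own.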

The Results section is the pay dirt here! This is why we are doing all this. Every time you enter or modify any of the values noted above, the spreadsheet will trigger thousands of simulations, and your Results values will change. The Results view gives you confidence intervals for when the remaining work will be accomplished. You can pick the confidence interval you would like to use; most places use 85%. It’s a forecast, much like a weather prediction, out for as long as you can see. The date, in my opinion, is data based on a guess. It can, however, give you an indication of issues over time. I have been on projects where that 85% date kept getting further and further out every iteration. We had a product that needed to be shipped in 4 months. After 4 weeks of cycles, with the forecast growing every cycle, it was showing 85% confidence of delivery at 8 months. This triggered a meeting, 1 month into the project, and it was determined that a lot of it was startup cost, not to worry, and that everyone should check back in 2 more cycles. At that next meeting, we were still 8 months from delivery against a hard due date that was 2.5 months away. Discussions were had, features were cut or scaled back, “wouldn’t it be nice if’s” were curtailed, and unnecessary planning for future functionality was stopped. We hit our date early and got to add in some fun things, all because we had data to make smart choices early.
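The early warning in that story comes from two checks: is the 85% date past the deadline, and does it keep moving out from cycle to cycle? Here is a small Python sketch of both. The helper names, the stand-in trial results, and the dates are made up for illustration; the spreadsheet shows the dates and leaves the comparison to you.

```python
import math
from datetime import date, timedelta


def forecast_date(sorted_weeks, confidence, start):
    """Date by which `confidence` of the simulated trials had finished."""
    index = max(math.ceil(confidence * len(sorted_weeks)) - 1, 0)
    return start + timedelta(weeks=sorted_weeks[index])


def drifting(forecasts):
    """True when every cycle's forecast date is later than the one before it."""
    return len(forecasts) > 1 and all(
        earlier < later for earlier, later in zip(forecasts, forecasts[1:])
    )


# Stand-in for real simulation output: weeks-to-finish from 10 trials.
simulated = sorted([8, 9, 9, 10, 10, 11, 12, 12, 14, 16])
start = date(2024, 1, 1)                 # hypothetical project start
deadline = start + timedelta(weeks=10)   # hypothetical hard due date

p85 = forecast_date(simulated, 0.85, start)
print("85% date:", p85, "| past deadline:", p85 > deadline)

# 85% dates recorded at the end of three hypothetical cycles.
history = [date(2024, 3, 4), date(2024, 3, 25), p85]
print("slipping every cycle:", drifting(history))
```

Either signal alone is a reason to talk; both together are the situation described above.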

Monte Carlo Forecasting stands out as an exceptional tool, demonstrating its effectiveness when employed thoughtfully and with due respect. It's not only widely available in various popular work-tracking software packages but has become a cornerstone in my belief in continuous improvement, both in processes and software. My focus revolves around maximizing hands-on-keyboard development time, minimizing meeting durations, and enhancing communication and visibility. Currently, my preferred arsenal includes Story Maps, Discovery Trees, and Monte Carlo Forecasting, a combination that, in my experience, surpasses the accuracy achieved by any other planning model, framework, or practice I've explored.

Addressing the blog's initial emphasis on human challenges in estimating and forecasting, the methodology I've outlined serves as an effective antidote. It has provided me with early warnings of potential issues and empowered the teams I work with to confidently meet deadlines without engaging in the time-consuming process of estimation. This approach is not just about avoiding tasks where human proficiency is lacking; it's about allowing the data to speak for itself, eliminating the need for constant estimation refinement, and fostering a culture where decisions are grounded in factual information.