Edgeware Agile, part 2: Data-Driven Long Term Forecasting

In a previous blog post, we described Edgeware’s agile transformation three months into the process. We’re now five months in, and since the last blog post, we’ve been constantly improving our ability to plan and deliver. We are now getting results in terms of shorter lead times, better planning flexibility and improving velocity trends for the organization.

One challenging area has been forecasting. Edgeware operates in a business-to-business setting where TV operators plan and make large investments 3-6 months into the future. We need to provide customers with accurate delivery dates on critical features.

Data-driven forecasting

Back in September, when we planned the last quarter, there was limited historical data on the throughput of the new organization (e.g. features completed per sprint). This made forecasting difficult. To forecast accurately, we wanted a fully data-driven approach.

Our unit of long-term planning is what we call a “feature”. This is a self-contained deliverable with clear value, either internally or for a customer. The idea is to make all features as small as possible so that they essentially become the “same” size, roughly one team sprint. Some will be bigger and some will be smaller, but if we have enough of them, the deviations will average out. Estimating the size of very large things is difficult. It is much easier to break them down into similarly sized items and count them.

During the past five months, we have been gathering historical throughput data, that is, features completed per sprint for the whole organization. The throughput data makes it possible to estimate the amount of work we can commit to in the five or six sprints of a quarter (two weeks per sprint). This, together with an ordered feature backlog, allows us to predict what will be completed by the end of a quarter.
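As a rough illustration of the idea above, the sketch below multiplies an average throughput by the number of sprints and cuts the ordered backlog at that point. The throughput numbers, the backlog names and the function forecast_scope are made up for illustration; they are not Edgeware data or tooling.

```python
from statistics import mean

def forecast_scope(throughput_history, backlog, sprints):
    """Return the backlog items expected to be done after `sprints` sprints."""
    # Expected number of features, rounded down to stay on the cautious side.
    expected = int(mean(throughput_history) * sprints)
    return backlog[:expected]

# Hypothetical data: features completed per sprint, and an ordered backlog.
history = [2, 3, 3, 4, 5, 4, 6, 5]
backlog = [f"feature-{i}" for i in range(1, 31)]

planned = forecast_scope(history, backlog, sprints=5)
print(len(planned), "features expected, last one:", planned[-1])
```

Everything above the cut line is a candidate for commitment, and everything below it is likely to slip to a later quarter. A plain average like this ignores both trend and variance.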

Historical throughput

Figure 1 shows what our historical throughput looks like. The bar chart depicts the number of features completed per sprint. The improving trend line is a linear regression. It suggests that for every sprint, we can deliver 0.4 more features than the previous sprint. The horizontal line is the average.

Since the linear regression and the average are two different ways to model reality, the true trend line is probably somewhere in between. As we gather more data, the linear regression line will converge with our true velocity trend.
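To make the two models concrete, here is a minimal sketch that fits a least-squares trend line and computes the average for the same throughput series, then forecasts the next five sprints with each. The data and the helper fit_line are hypothetical and only meant to show the mechanics.

```python
def fit_line(ys):
    """Ordinary least squares fit of y = intercept + slope * x for x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Hypothetical features completed per sprint (not Edgeware's actual numbers).
history = [2, 3, 3, 4, 5, 4, 6, 5]

slope, intercept = fit_line(history)
average = sum(history) / len(history)

# Forecast the next five sprints with both models.
next_sprints = range(len(history), len(history) + 5)
trend_total = sum(intercept + slope * x for x in next_sprints)
average_total = average * 5

print(f"slope: {slope:.2f} features per sprint")
print(f"trend forecast: {trend_total:.1f}, average forecast: {average_total:.1f}")
```

With an improving trend, the regression forecast is higher than the average forecast. Treating the two as an optimistic and a pessimistic bound is one way to read "somewhere in between".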

Forecasting accuracy

Another interesting use of historical data is to assess the accuracy of our forecast. Since we record the lead times of completed features, we have tried simulations where we randomly select different mixes of past features. For each random mix, we can see whether we would have completed the scope within the coming quarter. Sometimes, by chance, all the selected features are on the big side and we miss the scope.

Running thousands of simulations gives us an estimate, as a percentage, of how often we might miss. We can then size the set of “must have” features to give us a reasonable probability of success and top up to capacity with “nice to have” features.
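The simulation described above could be sketched as follows. This is an illustration under stated assumptions, not our actual tooling: feature sizes are expressed in team-sprints, the quarter's capacity is a single fixed number, and the sizes, the capacity and the function miss_probability are all hypothetical.

```python
import random

def miss_probability(past_sizes, scope, capacity, runs=10_000, seed=1):
    """Estimate how often `scope` randomly drawn features exceed `capacity`."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    misses = 0
    for _ in range(runs):
        # Draw a random mix of past features, with replacement.
        mix = rng.choices(past_sizes, k=scope)
        if sum(mix) > capacity:
            misses += 1
    return misses / runs

# Hypothetical feature sizes in team-sprints: mostly 1, sometimes 0.5, rarely up to 3.
past_sizes = [0.5, 0.5, 1, 1, 1, 1, 1, 1, 1.5, 2, 3]
capacity = 20  # assumed team-sprints available in the quarter

for scope in (14, 16, 18):
    p = miss_probability(past_sizes, scope, capacity)
    print(f"{scope} features: miss in {p:.0%} of simulations")
```

Sweeping over the scope like this shows where the miss rate starts to climb, which is where the line between “must have” and “nice to have” features would be drawn.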

Challenges

The accuracy of the forecast is heavily influenced by the size of the features. Still today, feature size varies a lot, and ranges from half a sprint (sometimes) to a sprint (often) to up to three sprints (rarely). This makes forecasting difficult, and especially so if we want to forecast delivery date and scope for an individual customer with a specific set of features. The smaller scope increases the influence of variance.

What we want to achieve is a process where we use reference features from the past to estimate whether or not a new feature of unknown size is “good enough” or “too big”. Of course, this will require the usual investigation work by R&D and product owners. If the feature is “too big” or if it is uncertain if it is really “good enough”, it should be split.

However, splitting features is hard! We have started to develop guidelines around slicing, a technique where we divide a feature into smaller parts that individually bring value. For example, when slicing, ask whether the feature can be split according to:

different user types? (admin/customer/end user, geographical area, premium user etc)
different interfaces/devices? (Web UI/REST API/programmable API/CLI, iPhone/Android etc)
different objects/variants? (receiving file transfer/sending file, different input/output video formats etc)
different levels of non-functional requirements? (single transaction/heavy load, single server/distributed system etc)
different functional requirements? (API only/bare minimum/good enough/bells and whistles, configurability, automation level, upgrade paths, compatibility etc)
…and so forth.

Actual predictions

Today, from the data, we predict that we have a 90% probability of delivering 20 features (yellow line in Figure 1) in the five sprints of this quarter and a 25% probability of delivering 30 features (green line). The probabilities change as we complete sprints and get a more accurate picture of our true throughput.
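One way such probabilities could be derived is by resampling past sprints, as in the sketch below. This is an assumption about the mechanics rather than a description of the exact model behind the numbers above: it draws five past sprint results at random, ignores the improving trend, and uses a hypothetical throughput history and a hypothetical function delivery_probability.

```python
import random

def delivery_probability(throughput_history, target, sprints, runs=10_000, seed=1):
    """Estimate the chance of completing at least `target` features in `sprints` sprints."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    hits = 0
    for _ in range(runs):
        # Simulate one quarter by drawing past sprint results with replacement.
        total = sum(rng.choices(throughput_history, k=sprints))
        if total >= target:
            hits += 1
    return hits / runs

# Hypothetical features completed per sprint (not Edgeware's actual numbers).
history = [2, 3, 3, 4, 5, 4, 6, 5]

for target in (15, 20, 25):
    p = delivery_probability(history, target, sprints=5)
    print(f"at least {target} features: {p:.0%}")
```

Re-running the estimate after each completed sprint, with the new data point added and one fewer sprint left, is what makes the probabilities move during the quarter.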

Since this quarter is the first in which we use a fully data-driven forecasting approach, we will have to come back with a later blog post on outcomes. Until then, we will continue improving the way we work and refining our processes. If you want to be a part of and contribute to the Edgeware agile transformation, please have a look at our open positions!

Author:

Johnny Bigert
VP R&D
