Lessons learned from forecasting
I’ve recently been tackling a few forecasting problems: reviewing historic data to find patterns and predict what might come next. Here’s what I discovered.
Why would we want to?
A question you should always ask when starting any problem is: why am I doing this? For forecasting, there are several important reasons. Planning for the future, reducing uncertainty and pivoting when things may not be going well are all good examples of general business cases where some form of forecasting can help. Intuitively, we all forecast every day. By creating a model and using time-series forecasting techniques, you can add more data and scientific rigour to this process and (hopefully) make it more reliable.
One of the business problems I have been investigating is finding insights into holiday patterns across the business. This involved looking at aggregated data on when people use their holidays, with the ultimate goal of predicting how many people would take a holiday in a given period and so assisting with resource scheduling.
What I learned
First, historic data only gets you so far. Looking at the seasonal variation and trends within a dataset can reveal clear and predictable patterns, which are very helpful if you expect them to continue. However, if you want to know why things might happen in the future or how they might change, you need explanatory data for why they happened in the past.
My advice on this is to speak to the domain experts and work together to find out what these explanatory factors might be. In the case of holiday modelling, the dates of local school holidays lined up with clear increases in the number of holidays booked, and including them helped improve the accuracy of the model, but more insight would have been helpful.
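To make the school holiday idea concrete, here is a minimal sketch of using a single explanatory variable. It is not the model I actually built: the weekly counts, the 0/1 school holiday flags and the function names are all made up for illustration. It estimates a baseline for ordinary weeks and an uplift for school holiday weeks, which gives the same result as a least-squares fit on a single 0/1 indicator.

```python
from statistics import mean

# Hypothetical weekly counts of people on holiday (made-up numbers).
bookings = [12, 14, 13, 30, 28, 15, 13, 12, 32, 12]
# 1 if the week overlaps a local school holiday, else 0 (also made up).
school_holiday = [0, 0, 0, 1, 1, 0, 0, 0, 1, 0]


def fit_holiday_uplift(counts, flags):
    """Return (baseline, uplift).

    baseline: mean count in ordinary weeks.
    uplift:   extra people on holiday, on average, in school holiday weeks.
    """
    ordinary = [c for c, f in zip(counts, flags) if not f]
    holiday = [c for c, f in zip(counts, flags) if f]
    baseline = mean(ordinary)
    uplift = mean(holiday) - baseline
    return baseline, uplift


def predict(baseline, uplift, flag):
    """Point forecast for a week, given its school holiday flag (0 or 1)."""
    return baseline + uplift * flag


baseline, uplift = fit_holiday_uplift(bookings, school_holiday)
print("ordinary week:", predict(baseline, uplift, 0))
print("school holiday week:", predict(baseline, uplift, 1))
```

A real model would combine a flag like this with trend and seasonal terms, but even this toy version shows how a known future event (the school calendar) can feed directly into a forecast.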
Second, a forecast is only ever an educated guess. Things can change for many unexpected reasons, and there is no reason to believe that a forecast will play out exactly as predicted. How “educated” this guess is depends on many factors, such as how good your data is, how consistent the system you are modelling is and, most likely, some simple random chance. This is why it is important to include some form of uncertainty measure in any model you produce, to allow for some wiggle room and assist in planning for possible outcomes. This also gives more reason to add explanatory variables to your model, as changes in these may account for some of the larger changes that could happen in the real world.
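One simple way to attach an uncertainty measure is to look at how wrong past forecasts were and wrap the new forecast in a range based on those errors. The sketch below assumes you have kept a history of forecasts alongside what actually happened. The numbers and the function name are hypothetical, and the 10th and 90th percentiles are just one reasonable choice for a rough 80% range.

```python
from statistics import quantiles

# Hypothetical past forecasts and what actually happened (made-up numbers).
past_forecasts = [20, 22, 25, 30, 28, 24, 21, 26]
past_actuals = [18, 25, 24, 35, 27, 20, 22, 29]


def empirical_interval(point, forecasts, actuals):
    """Wrap a point forecast in a rough 80% range built from past errors.

    Takes the 10th and 90th percentiles of (actual - forecast) and adds
    them to the new point forecast.
    """
    errors = [a - f for a, f in zip(actuals, forecasts)]
    # n=10 gives nine cut points: the 10th, 20th, ..., 90th percentiles.
    # method="inclusive" keeps the cut points within the observed errors.
    cuts = quantiles(errors, n=10, method="inclusive")
    return point + cuts[0], point + cuts[-1]


low, high = empirical_interval(27, past_forecasts, past_actuals)
print(f"forecast 27, plausible range {low:.1f} to {high:.1f}")
```

This assumes future errors will look like past ones, which is itself a guess, but it turns a single number into a range that people can plan around.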
Next, some insights are obvious. I came away from absence modelling knowing two key things: people are hard to predict, and most holidays are taken during the summer and around Christmas (and I'm sure this will also depend heavily on when holiday rollover dates fall). I'm confident anyone could have told me this before I started the project. While the forecasts I had made were accurate, they also needed to be useful to others.
This leads on to my final point: it can be easy to get lost. This one likely applies to problem solving in general but is useful advice regardless. When you are given a large amount of data and told “see what you can find”, you may come up with some very interesting new ideas, or you may get lost in the details. To help with this, it is important to have a predefined problem you are trying to solve, a goal you can head towards, to keep you on track. Ultimately, you want to be able to produce an output that is useful to someone; otherwise, what was the point?
Forecasting is such a common business problem, and now that I have some more experience in the area, I expect it won't be too long before I'm looking at this again. For the next project, however, the key thing I will focus on is structuring the problem from the start. In research, it's important to go in with as open a mind as possible. If you make sure your inherent biases don't corrupt your outcomes, you never know what you might find! However, just looking to see what you can find often leads to falling down rabbit holes and forgetting what you wanted to know in the first place. Know what questions you are trying to answer, get the resources and domain expertise you need to understand the problem, and work towards a solution in measured steps (while still keeping an eye out for interesting rabbit holes).