How Can I Improve My Forecast Accuracy?

Imagine

Imagine that Amanda (a completely imaginary person) is a demand planner at Kool Komfort Foods (a completely imaginary company, also branded as K2), a nationwide producer of healthy comfort foods.  She got her bachelor’s in Mechanical Engineering about five years ago.  After a short stint in manufacturing process engineering in another industry, she got interested in the business side of things and moved into supply chain planning, starting in demand planning.  After taking a couple of on-line courses and getting an APICS certification, she seized the opportunity to be a junior demand planner at K2.  Through her affinity for math and her attention to detail, Amanda earned a couple of promotions and is now a senior demand planner.  At present, she manages a couple of product lines, but has her sights set on becoming a demand planning manager and mentoring others.

Amanda has been using some of the common metrics for forecast accuracy, including MAPE (mean absolute percentage error) and weighted MAPE, but the statistical forecast doesn’t seem to improve and the qualitative inputs from marketing and sales are hit or miss.  Her colleague, Jamison, uses the standard deviation of forecast error to plan for safety stock, but there are still a lot of inventory shortages and overages.  

Amanda has heard her VP, Dmitry, present to other department heads how good the forecast is, but when he does that, he uses aggregate measures and struggles when he is asked to explain why order fill rate is not improving, if the forecast is so good.

Amanda wonders what is preventing her from getting better results at the product/DC level, where it counts.  She would love to have it at the product/customer or product/store level, but she knows that she will need better results at the product/DC level before she can do that.  She is running out of explanations for her boss and the supply planning team.  She has been using some basic forecasting techniques that she has programmed into Excel, like single and double exponential smoothing as well as moving average, and even linear regression.  She is sure the math is correct, but the results have been disappointing. 

Amanda’s company just bought a commercial forecasting package.  She was hoping that would help.  It is supposed to run a bunch of models and select the best one and optimize the parameters, but so far, the simpler models perform the best and are no better – and sometimes worse – than her Excel spreadsheet.

Amanda has been seeing a lot of posts on LinkedIn about “AI”.  She has been musing to herself about whether there is some magic bullet in that space that might deliver better results.  But, she hasn’t had time to learn much about the details of that kind of modeling.  In fact, she finds it all a bit overwhelming, with all of the hype around the topic.

And, anyway, forecasts will always be wrong, they will always change, and the demand planner will always take the blame.  Investments in forecasting will inevitably reach diminishing returns, but for every improvement in forecast accuracy, there are cascading benefits through the supply chain and improvements in customer service.  So, what can Amanda and her company do to make sure they are making the most of the opportunity to anticipate market requirements without overinvesting and losing focus on the crucial importance of developing an ever more responsive value network to meet constantly changing customer requirements?

Unfortunately, there really is no “silver bullet” for forecasting, no matter how many hyperbolic adjectives are used by a software firm in their pitch.  That is not to say that a software package can’t be useful, but you need to really understand what you need and why before you go shopping.  

Demand planning consists of both quantitative and qualitative analysis.  Since the quantitative input can be formulated and automated (not that it’s easy or quick), it can be used for calculating and updating a probabilistic range for anticipated demand over time.

A good quantitative forecast requires hard work and skilled analysis.  Creating the best possible quantitative forecast (without reaching diminishing returns) will provide a better foundation for, and even improve, qualitative input from marketing, sales, and others.

Profiling

One of the first things you need to do is understand the behavior of the data.  This requires profiling the demand by product and location (either shipping plant/DC or customer location – let’s call that a SKU for ease of reference) with respect to volume and variability in order to determine the appropriate modeling approach.  For example, a basic approach is as follows:

  • High volume, low variability SKUs will be easy to forecast mathematically and may be suited for lean replenishment techniques.
  • Low volume, low variability SKUs may be best suited for simple reorder point replenishment.
  • High volume, high variability SKUs will be difficult to forecast and may require a sophisticated approach to safety stock planning.
  • Low volume, high variability SKUs may require a thoughtful postponement approach, resulting in an assemble or make-to-order process.

A more sophisticated approach would involve the use of machine learning to find clusters of demand along more dimensions than volume and variability.
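The basic volume/variability segmentation above can be sketched in a few lines of Python.  This is only an illustration: the function name, the thresholds, and the sample histories are all assumptions, and variability is measured here with the coefficient of variation (standard deviation divided by mean), which is one common choice among several.

```python
from statistics import mean, pstdev

def profile_sku(history, volume_threshold, cv_threshold=0.5):
    """Classify one SKU's demand history by volume and variability.

    volume_threshold and cv_threshold are illustrative assumptions,
    not recommended values.
    """
    avg = mean(history)
    # Coefficient of variation: standard deviation relative to the mean.
    cv = pstdev(history) / avg if avg > 0 else float("inf")
    volume = "high" if avg >= volume_threshold else "low"
    variability = "high" if cv >= cv_threshold else "low"
    return volume, variability

# Suggested approaches, paraphrasing the four segments described above.
APPROACH = {
    ("high", "low"): "statistical forecast; lean replenishment candidate",
    ("low", "low"): "simple reorder point replenishment",
    ("high", "high"): "hard to forecast; careful safety stock planning",
    ("low", "high"): "postponement; assemble or make to order",
}

# Hypothetical demand histories (units per period).
demand = {
    "SKU-A": [100, 105, 98, 102, 101, 99],
    "SKU-B": [5, 0, 40, 0, 2, 25],
}

for sku, history in demand.items():
    segment = profile_sku(history, volume_threshold=50)
    print(sku, segment, APPROACH[segment])
```

In practice, the thresholds would come from looking at the distribution of volume and variability across your own SKUs rather than from fixed numbers like these.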

Profiling analysis can be complemented nicely by a Quantitative Reasonability Range Check (see below), which should be an on-going part of your forecasting process.

Once you have profiled the data, you can start to develop the quantitative forecast, but you will need to consider the questions:

  1. What is the appropriate level of aggregation for forecasting?
  2. What forecast lag should I use?
  3. How frequently should I forecast?
  4. What are the appropriate quantitative forecast models?
  5. How should I initialize the settings for model parameters?
  6. How should I consume the forecast?
  7. How will I compensate for demand that I couldn’t capture?
  8. What metrics should I use to measure forecast accuracy?

Let’s consider each of these questions, in turn.

A. Level of Aggregation

The point of this analysis is to determine which of the following approaches will provide you with the best results:

  • Forecasting at the lowest level and then aggregating up
  • Forecasting at a high level and just disaggregating down
  • Forecasting at a mid-level and aggregating up and, also, disaggregating down
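One way to compare these options is to run each against the same history and see which one does better on a period you held back.  The sketch below contrasts the first two approaches for a two-SKU family.  The three-period moving average, the share-of-history split, and all of the numbers are assumptions chosen for illustration only.

```python
def moving_average(history, window=3):
    """Naive forecast: average of the most recent `window` periods."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def bottom_up(histories):
    """Forecast each SKU separately; the family forecast is their sum."""
    sku_fc = {sku: moving_average(h) for sku, h in histories.items()}
    return sku_fc, sum(sku_fc.values())

def top_down(histories):
    """Forecast the family total, then split it by long-run share of history."""
    total_history = [sum(period) for period in zip(*histories.values())]
    total_fc = moving_average(total_history)
    grand_total = sum(total_history)
    sku_fc = {
        sku: total_fc * sum(h) / grand_total for sku, h in histories.items()
    }
    return sku_fc, total_fc

def abs_error(forecast, actual):
    """Absolute error per SKU."""
    return {sku: abs(forecast[sku] - actual[sku]) for sku in actual}

# Hypothetical history: one SKU growing, one declining.
histories = {
    "SKU-A": [100, 110, 90, 120, 130, 140],
    "SKU-B": [60, 50, 55, 40, 35, 30],
}
actual_next = {"SKU-A": 150, "SKU-B": 25}  # the held-back period

sku_bu, total_bu = bottom_up(histories)
sku_td, total_td = top_down(histories)
print("bottom-up errors:", abs_error(sku_bu, actual_next))
print("top-down errors: ", abs_error(sku_td, actual_next))
```

With this made-up data the family totals agree, but the SKU-level forecasts do not, because the long-run shares lag the shift in mix.  Other data could easily favor the top-down split, which is why the comparison needs to be run on your own history.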

B. Correct Lag

If you forecast today for the demand you expect tomorrow, you should be pretty accurate because you will have the most information possible, prior to actually receiving orders.  The problem with this is obvious.  You can’t react to this forecast (which will change each day up until you start taking orders for the period you are forecasting) by redistributing or manufacturing product because that takes some time.

Since you cannot procure raw materials, manufacture, pack, or distribute instantly, the “lead time” for these activities needs to be taken into account.  So, you need to have a forecast lag.  For example, if you need a month to respond to a change in demand, then, you would need to forecast this month for next month.  You can continue to forecast next month’s demand as you move through this month, but it’s unlikely you will be able to react, so when you measure forecast accuracy, you need to measure it at the appropriate lag.
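To measure accuracy at a given lag, you have to keep each forecast snapshot along with the period in which it was made.  The following sketch assumes a simple dictionary keyed by (period created, period forecasted); the data structure, function names, and numbers are hypothetical.

```python
def forecasts_at_lag(snapshots, lag):
    """Return {target_period: forecast} using only forecasts made `lag` periods ahead.

    snapshots maps (created_period, target_period) -> forecast quantity.
    """
    return {
        target: qty
        for (created, target), qty in snapshots.items()
        if target - created == lag
    }

def mean_abs_error(forecast_by_period, actuals):
    """Mean absolute error over periods that have both a forecast and an actual."""
    periods = [p for p in forecast_by_period if p in actuals]
    if not periods:
        raise ValueError("no overlapping periods to measure")
    return sum(abs(forecast_by_period[p] - actuals[p]) for p in periods) / len(periods)

# Hypothetical snapshots: each month, forecasts are made for the next two months.
snapshots = {
    (1, 2): 100, (1, 3): 100,
    (2, 3): 115, (2, 4): 110,
    (3, 4): 125,
}
actuals = {2: 104, 3: 118, 4: 126}

for lag in (1, 2):
    error = mean_abs_error(forecasts_at_lag(snapshots, lag), actuals)
    print(f"lag {lag}: mean absolute error = {error:.1f}")
```

If the supply chain needs two periods to respond, the lag-2 number is the one that matters, even though the lag-1 number will usually look better.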

C. Frequency

Should you generate a new forecast every day? Every week?  Or, just once a month?  This largely depends on when you can get meaningful updates to your forecast inputs such as sales orders, shipment history, or updates to industry and any syndicated or customer data (whether leading or trailing indicators) that are used in your quantitative forecast.

D. Appropriate Forecasting Model(s)

So, what mathematical model should you use?  This is a key question, but as you can see, certainly not the only one.

The mathematical approach can depend on many factors, including, but not limited to, the following:

  • Profiling (discussed above)
  • Available and meaningful trailing and leading indicators
  • Amount of history needed for the model vs. history that’s still relevant
  • Forecasting a distribution of demand vs. forecasting the actual distribution 
  • Explainability vs. accuracy of the model
  • The appearance of accuracy vs. useful accuracy (overfitting a model to the past)
  • Treatment of qualitative data (e.g., geography, holiday weekends, home football game, etc.)

A skilled data scientist can be a huge help.  A plethora of techniques is available, but a powerful machine learning (or other) technique can be like a sharp power tool.  You need to know what you’re doing and how to avoid hurting yourself.

E. Initializing the Steady State Settings for Parameters

Failure to properly initialize the parameters of a statistical model can cause it to underachieve.  In the case of Holt-Winters 3 parameter smoothing, for example, the modeler needs to have control over how much history is used for initializing the parameters.  If too little history is used, then forecasts will likely be very unreliable. 
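As a rough illustration of what “control over how much history is used” can mean, here is a sketch of one simple way to set the starting level, trend, and seasonal offsets for an additive Holt-Winters model.  The scheme, the function name, the init_seasons parameter, and the sample data are assumptions for illustration; commercial packages and statistical libraries each have their own initialization methods.

```python
def init_holt_winters(history, season_length, init_seasons=2):
    """Starting level, trend, and additive seasonal offsets.

    Uses the first `init_seasons` complete seasons of history
    (at least two are needed to estimate a trend).
    """
    if init_seasons < 2 or len(history) < init_seasons * season_length:
        raise ValueError("need at least two full seasons of history to initialize")
    seasons = [
        history[i * season_length:(i + 1) * season_length]
        for i in range(init_seasons)
    ]
    season_means = [sum(s) / season_length for s in seasons]
    # Trend per period: change in seasonal mean from first to last season,
    # spread over the number of periods between them.
    trend = (season_means[-1] - season_means[0]) / ((init_seasons - 1) * season_length)
    # Level: mean of the first season.
    level = season_means[0]
    # Seasonal offsets: average deviation of each position from its season's mean.
    seasonal = [
        sum(seasons[k][pos] - season_means[k] for k in range(init_seasons)) / init_seasons
        for pos in range(season_length)
    ]
    return level, trend, seasonal

# Hypothetical quarterly demand, two years.
quarterly = [10, 20, 30, 20, 14, 24, 34, 24]
print(init_holt_winters(quarterly, season_length=4))
```

Using more seasons for initialization averages out noise in the seasonal offsets, at the cost of leaning on older history that may no longer be relevant.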

When it comes to machine learning, there are two kinds of parameters – hyperparameters and model parameters.  Training can optimize the model parameters, but knowledge, experience and care are required to select techniques that are likely to help and to set the hyperparameters so that the models give you good results.

F. Forecast Consumption Rules

There are a few things to consider when you consume the forecast with orders.  For example, you might want to bring forward previously unfulfilled forecasts (or underconsumption) from the previous period(s), or there may be a business reason to simply treat consumption in each week or month in isolation.

You may want to calculate the forecast consumption more frequently than you generate a new forecast.
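The two consumption policies described above (carrying underconsumption forward versus treating each period in isolation) can be sketched as follows.  The function name, the carry_forward flag, and the numbers are hypothetical, and real consumption rules usually have more options, such as backward consumption windows.

```python
def consume_forecast(forecasts, orders, carry_forward=True):
    """Net orders against forecast, period by period.

    Returns the unconsumed forecast remaining in each period.  When
    carry_forward is True, unconsumed forecast from one period rolls
    into the next; otherwise each period is treated in isolation.
    """
    remaining = []
    carried = 0.0
    for forecast, ordered in zip(forecasts, orders):
        available = forecast + carried
        left = max(available - ordered, 0.0)
        remaining.append(left)
        carried = left if carry_forward else 0.0
    return remaining

forecasts = [100, 100, 100]
orders = [80, 110, 100]

print("carry forward:", consume_forecast(forecasts, orders, carry_forward=True))
print("isolated:     ", consume_forecast(forecasts, orders, carry_forward=False))
```

Because a function like this only needs the current forecast and the latest orders, it can be rerun daily even if the forecast itself is only regenerated monthly.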

G. Compensating for Demand You Couldn’t Capture

This is a particular challenge in the retail and CPG industries.  In CPG, many orders from retail customers are placed and fulfilled on a “fill or kill” basis.  The CPG firm fulfills what it can with the inventory on hand and then cancels or “kills” the rest of the order.

In retail, a consumer may simply go to a competitor or order online if the slot for the product on the shelf in a given store is empty.

In either case, sales or shipment history will under-represent true demand for that period.  If you don’t accurately compensate for this, your history will likely drive your forecast model to under-forecast.
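A simple sketch of two ways to adjust history is shown below.  If you recorded the “killed” quantities, you can add them back.  If you only know which periods had a stockout, one crude fallback is to lift those periods to the average of the in-stock periods.  Both rules, the function name, and the data are assumptions for illustration.  Note that adding back killed quantities can overstate demand when customers reorder the same units later.

```python
from statistics import mean

def adjust_history(shipped, killed=None, stocked_out=None):
    """Estimate unconstrained demand from shipment history.

    killed:      cancelled ("killed") quantity per period, if recorded.
    stocked_out: True/False per period, used only when killed is None.
    """
    if killed is not None:
        # Demand = what shipped plus what was cut from the order.
        return [s + k for s, k in zip(shipped, killed)]
    if stocked_out is None:
        raise ValueError("provide either killed quantities or stockout flags")
    in_stock = [s for s, out in zip(shipped, stocked_out) if not out]
    baseline = mean(in_stock)
    # Lift stockout periods to the in-stock average; never lower actual shipments.
    return [max(s, baseline) if out else s for s, out in zip(shipped, stocked_out)]

shipped = [100, 60, 95, 105]

print(adjust_history(shipped, killed=[0, 45, 0, 0]))
print(adjust_history(shipped, stocked_out=[False, True, False, False]))
```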

H. Metrics and Measurement

There are many measures of forecast accuracy that can be used.  A couple of key questions to answer include the following:

  1. Who is the audience and what is their interest?  Consider the sales organization which is interested in an aggregate measure of sales against their sales target, perhaps by sales group or geography.  On the other hand, customer service doesn’t really happen in aggregate.  If you want to have better customer service, you need to look at forecast accuracy at the SKU level.
  2. Are you measuring forecast error based on an assumed normal distribution that you have defined by projecting a mean and standard deviation?  Or, have you been able to use the actual distribution of forecast error, perhaps created through bootstrapping? 

Remember that you will need to measure forecast error at the correct lag.
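To make the point about audience concrete, here is a small sketch of MAPE and weighted MAPE, applied once at the SKU level and once in aggregate.  The function names and numbers are made up for illustration.  MAPE here skips periods with zero actual demand, which is one common convention and not the only one.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error over entries with nonzero actuals."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)

def wmape(actuals, forecasts):
    """Weighted MAPE: total absolute error divided by total actual volume."""
    total_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return total_error / sum(abs(a) for a in actuals)

# Two hypothetical SKUs in one period: one over-forecast, one under-forecast.
sku_actuals = [100, 100]
sku_forecasts = [130, 70]

sku_level = wmape(sku_actuals, sku_forecasts)
aggregate = wmape([sum(sku_actuals)], [sum(sku_forecasts)])

print(f"SKU-level weighted MAPE: {sku_level:.0%}")
print(f"Aggregate weighted MAPE: {aggregate:.0%}")
```

The errors cancel in aggregate, so the total looks perfect while each SKU is off by 30%.  That is the gap between the forecast accuracy a VP might present and the fill rate that customers actually experience.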

Another thing you may need to keep in mind is that not everyone has been trained to understand forecast error and its interrelationship to inventory, safety stock, and fill rate.  You may have a bit of education to do from time to time, even for executives.

Price & Forecast

In most cases, demand is sensitive to price.  In other words, there is a relationship between what you charge for something and the demand for it.  This is why consumer packaged goods companies run promotions and fund promotions with retailers, and also, why retailers run their own promotions.  The goal is to grow sales without losing money and/or gain market share (possibly incurring a short-term loss).  The overall goal is to increase gross margin in a given time period.  Many CPG companies make competing products – think of shampoo or beverages, or even automobiles or car batteries.  And, of course, retailers sell products from their CPG suppliers that compete for shelf space and share of wallet.  Many retailers even sell their own private label goods.  The trick is how to price competing products such that you gain sales and margin over the set of products.

Just as in forecasting demand, there are both quantitative and qualitative approaches to optimizing pricing decisions which, then, in turn, need to be incorporated into the demand forecast.  The quantitative approach has two components:

  1. Using ML techniques to predict price elasticity, considering history, future special events (home football game, holiday weekend, football team in playoffs, etc.), minimum and maximum demand, and perhaps other features.
  2. Optimizing the promotional offers so that margin is maximized.  For this, a mathematical optimization model may be best so that the total investment in promotional discount and allocations of that investment are respected, limits on cannibalization are enforced, and upper limits on demand are considered.
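As a much simpler stand-in for the ML approach in the first component, the sketch below estimates a single constant price elasticity with a log-log least-squares fit and uses it to project demand at a promotional price.  It ignores special events, demand limits, and cannibalization, and it does not attempt the optimization in the second component.  The function names and data are hypothetical.

```python
import math

def price_elasticity(prices, quantities):
    """Slope of ln(quantity) on ln(price), fitted by ordinary least squares."""
    x = [math.log(p) for p in prices]
    y = [math.log(q) for q in quantities]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def demand_at_price(base_qty, base_price, new_price, elasticity):
    """Project demand at a new price, assuming constant elasticity."""
    return base_qty * (new_price / base_price) ** elasticity

# Synthetic history generated with an elasticity of exactly -2.
prices = [2.0, 2.5, 3.0, 3.5, 4.0]
quantities = [1000 * p ** -2 for p in prices]

e = price_elasticity(prices, quantities)
promo_qty = demand_at_price(base_qty=250, base_price=2.0, new_price=1.6, elasticity=e)
print(f"estimated elasticity: {e:.2f}")
print(f"projected demand at promo price: {promo_qty:.1f}")
```

The projected promotional volume is what would then feed into the demand forecast for the promotion period.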

The Quantitative Reasonability Range Check

There is a process that should be part of both your demand planning and your sales and operations planning.  The concept is simple – how do you find the critical few forecasts that require attention, so that planner brainpower is expended on making a difference and not hunting for a place to make a difference?  A Quantitative Reasonability Range Check (QRC, for short) is designed to do exactly this.  If the historical data is not very dense, then a “reasonability range” may need to be calculated through “bootstrapping”, a process of randomly resampling the history (with replacement) to build an empirical distribution.  Once you have this distribution, you can assign a probability to a future forecast and leverage that probability for safety stock planning as well.
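A bare-bones sketch of a bootstrapped reasonability range might look like the following.  It resamples past demand with replacement, takes percentiles of the resampled totals as the range, and flags any forecast that falls outside.  The function names, percentile choices, sample count, and history are assumptions for illustration.  This simple version treats periods as interchangeable, so it does not reflect seasonality or trend.

```python
import random

def bootstrap_range(history, horizon=1, n_samples=5000,
                    lower_pct=5, upper_pct=95, seed=42):
    """Empirical range for total demand over `horizon` periods.

    Resamples the history with replacement, so no normal
    distribution is assumed.
    """
    rng = random.Random(seed)
    totals = sorted(
        sum(rng.choices(history, k=horizon)) for _ in range(n_samples)
    )
    lo = totals[int(n_samples * lower_pct / 100)]
    hi = totals[min(int(n_samples * upper_pct / 100), n_samples - 1)]
    return lo, hi

def needs_review(forecast, history, **kwargs):
    """True when the forecast falls outside the bootstrapped range."""
    lo, hi = bootstrap_range(history, **kwargs)
    return not (lo <= forecast <= hi)

# Hypothetical monthly demand with one large spike.
history = [80, 95, 100, 90, 110, 300, 85, 105, 95, 100, 90, 120]

print("one-month range:  ", bootstrap_range(history))
print("three-month range:", bootstrap_range(history, horizon=3))
print("review forecast of 1000?", needs_review(1000, history))
```

Because the range comes from what actually happened, it can be asymmetric: one large spike in the history stretches the upper bound without moving the lower bound.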

At a minimum, a QRC must consider the following components:

  • Every level and combination of the product and geographical hierarchies
  • A quantitative forecast
  • An asymmetric prediction interval over time
  • Metrics for measuring how well a point forecast fits within the prediction interval
  • Tabular and graphical displays that are interactive, intuitive, always available, and current

If you are going to attempt to establish a QRC, then I would suggest five best practices:

  1. Eliminate duplication.  When designing a QRC process (and supporting tools), it is instructive to consider the principles of Occam’s razor as a guide:

– The principle of plurality – Plurality should not be used without necessity

– The principle of parsimony – It is pointless to do with more what can be done with less

These two principles of Occam’s razor are useful because the goal is simply to flag unreasonable forecasts that do not pass a QRC, so that planners can focus their energy on asking critical questions only about those cases.

  2. Minimize human time and effort by automating the math.  Leverage automation and, potentially, even cloud computing power, to deliver results that are self-explanatory and always available, providing an immediately understood context that identifies invalid forecasts.

  3. Eliminate inconsistent judgments.  By following #1 and #2 above, you avoid inconsistent judgments that vary from planner to planner, from product family to product family, or from region to region.

  4. Reflect reality.  Calculations of upper and lower bounds of the prediction interval should reflect seasonality and cyclical demand in addition to month-to-month variations.  A crucial aspect of respecting reality involves calculating the reasonability range for future demand from what actually happened in the past so that you do not force assumptions of normality onto the prediction interval (this is why bootstrapping can be very helpful).  Among other things, this will allow you to predict the likelihood of over- and under-shipment.

  5. Illustrate business performance, not just forecasting performance, with prediction intervals.  The range should be applied, not only from time-period to time-period, but also cumulatively across periods such as months or quarters in the fiscal year.

Summary

Demand planning is both quantitative and qualitative.  In this paper, we have touched on the high points of the best practices for building a good quantitative forecasting foundation for your demand planning process.  In our imaginary case, Amanda still has some work to do, some of which lies outside of her expertise.  She will need to articulate the case for making an investment to improve the quantitative forecast and building a better foundation for qualitative input and a consensus demand planning process.  A relatively small improvement in forecast accuracy can have significant positive bottom and top-line impact.  

Amanda needs to convince her management to invest in a consulting service that will deliver the math, without the hype, and within the context of experience, so that she can answer the key quantitative questions every demand planner faces:

  • What is the profile of my demand data?
  • What is the appropriate level of aggregation for forecasting?
  • What forecast lag should I use?
  • How frequently should I forecast?
  • What are the appropriate quantitative forecast models?
  • How should I initialize the settings for model parameters?
  • How should I consume the forecast?
  • How will I compensate for demand that I couldn’t capture?
  • What metrics should I use to measure forecast accuracy?

About Arnold Mark Wells
Industry, software, and consulting background. I help companies do the things about which I write. If you think it might make sense to explore one of these topics for your organization, I would be delighted to hear from you. I am solely responsible for the content in Supply Chain Action.
