Time Series & Regression Supply Forecasting Model
2021 · Time Series · Regression · Our World in Data
Impact
- 15% lower MAPE than the incumbent projections, which had systematic upward bias from stakeholders hedging against shortage risk
- Presented results to the Chief Procurement Officer; the company implemented the model across multiple supply categories
- The client rated the research and results 8.5/10 and began investing in scaling the approach
- Built for one category; designed from the start to scale, and extended to additional categories with the analytics department
- Confidence intervals were integrated directly into the stochastic supply allocation model
Business Problem
The client was systematically overforecasting supply needs across multiple product categories. The motive was understandable: avoid any shortage risk. But the result was costly excess inventory across the supply chain.
The harder challenge was that the business believed specific events would meaningfully impact demand, and wanted those effects modelled in isolation so they could run scenario simulations. A simple time series forecast was not enough, because it produces a single projection and cannot attribute movements to individual events.
Solution Design
A hybrid forecasting approach combining time series models with a regression-based event impact layer. Historical supply delivery data was processed through exponential smoothing for the base trend, then passed through a regression layer that modelled event effects separately. COVID-19 restrictions were represented by external signals from Our World in Data and Google Maps mobility data; internal promotions were taken from the calendar.
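The two-stage idea can be sketched as follows. This is a minimal illustration, not the delivered model: it assumes simple exponential smoothing for the baseline and ordinary least squares on the baseline residuals for the event layer. The function names, the smoothing factor, and the sample data are all hypothetical.

```python
def exp_smooth_one_step(series, alpha=0.3):
    """One-step-ahead simple exponential smoothing.

    Returns len(series) + 1 values: forecasts[t] uses only observations
    before t (forecasts[0] is seeded with the first observation), and the
    last value is the forecast for the next, unseen period.
    """
    level = series[0]
    forecasts = [level]
    for y in series:
        level = alpha * y + (1 - alpha) * level
        forecasts.append(level)
    return forecasts


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(feature_rows, target):
    """Least squares with an intercept; returns [intercept, coef_1, ...]."""
    rows = [[1.0] + list(r) for r in feature_rows]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear(xtx, xty)


def hybrid_forecast(baseline, coefs, event_row):
    """Baseline plus the regression layer's event adjustment."""
    return baseline + coefs[0] + sum(c * x for c, x in zip(coefs[1:], event_row))


# Hypothetical weekly deliveries and [restriction_index, promo_flag] per week.
deliveries = [100.0, 102.0, 80.0, 78.0, 95.0, 118.0, 101.0, 99.0]
events = [[0, 0], [0, 0], [60, 0], [65, 0], [20, 0], [0, 1], [0, 0], [0, 0]]

baseline = exp_smooth_one_step(deliveries)
residuals = [y - b for y, b in zip(deliveries, baseline[:-1])]
coefs = fit_ols(events, residuals)  # [intercept, restriction, promo]
next_week = hybrid_forecast(baseline[-1], coefs, [30, 1])
```

In this simplified form the smoother absorbs part of each event's effect before the regression sees it, so the residual coefficients tend to understate the true impacts. A production version would need to address that, for example by fitting both components jointly.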
Output was supply forecasts with confidence intervals, fed into a stochastic supply allocation model and used directly for scenario simulation per event type.
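A rough sketch of how an interval and a per-event scenario could be produced from such a model. It assumes roughly normal residuals with constant variance, which the source does not confirm; the effect sizes, event names, and function names are hypothetical.

```python
from statistics import NormalDist, stdev


def scenario_forecast(baseline, effects, scenario):
    """Add the isolated effect of each event switched on in the scenario."""
    return baseline + sum(effects[name] * value for name, value in scenario.items())


def prediction_interval(point, residuals, level=0.95):
    """Symmetric interval around a point forecast.

    Assumes residuals are approximately normal and homoscedastic.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(residuals)
    return point - half_width, point + half_width


# Hypothetical per-unit effects estimated by the regression layer.
effects = {"promo": 12.0, "restriction_index": -0.4}
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Scenario: a promotion runs while restrictions sit at index 50.
point = scenario_forecast(100.0, effects, {"promo": 1, "restriction_index": 50})
low, high = prediction_interval(point, residuals)
```

Because each event enters as its own term, a scenario is just a different set of event values applied to the same baseline.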
Technical Challenges
Regime shifts from COVID. Standard time series models assume the historical pattern persists or changes only gradually. Lockdowns and reopenings created sudden breaks in the delivery series that the time series component could not anticipate from past deliveries alone. The regression layer was added specifically to handle these: external signals like hospital admissions and mobility data acted as leading indicators of the breaks.
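One way to feed leading indicators into a regression without look-ahead is to lag each signal, so the row for period t contains only values known before t. A minimal sketch; the signal names, lag lengths, and helper name are assumptions, not the lags used in the project.

```python
def build_lagged_rows(signals, lags):
    """Align lagged external signals with the target series.

    signals: dict of name -> list of values, all the same length
    lags:    dict of name -> number of periods the signal leads the target
    Returns (start, names, rows), where rows[i] holds each signal's value
    lags[name] periods before time start + i.
    """
    names = sorted(signals)
    n = len(signals[names[0]])
    start = max(lags[name] for name in names)
    rows = [
        [signals[name][t - lags[name]] for name in names]
        for t in range(start, n)
    ]
    return start, names, rows


# Hypothetical weekly signals: admissions lead by 2 weeks, mobility by 1.
signals = {
    "admissions": [1, 2, 3, 4, 5],
    "mobility": [10, 20, 30, 40, 50],
}
start, names, rows = build_lagged_rows(signals, {"admissions": 2, "mobility": 1})
```

The returned `start` marks the first target period with a complete feature row, so the target series should be trimmed to `target[start:]` before fitting.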
Isolating event effects. Promotions, COVID restrictions, and macro trends all move the series simultaneously. Separating each effect required careful feature design in the regression layer to limit collinearity and keep the isolated impacts interpretable for scenario simulation. When features are strongly correlated, the regression cannot attribute a change to a single event, and the per-event scenario outputs become meaningless.
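A simple screen for this problem is to flag feature pairs whose correlation exceeds a threshold before fitting. The sketch below is a pairwise check only: it is a lighter stand-in for variance inflation factors and will miss dependence spread across three or more features. The threshold, feature names, and data are hypothetical.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def flag_collinear(features, threshold=0.8):
    """Return (name_a, name_b, r) for pairs with |r| above the threshold."""
    flagged = []
    for a, b in combinations(sorted(features), 2):
        r = pearson(features[a], features[b])
        if abs(r) > threshold:
            flagged.append((a, b, r))
    return flagged


# Hypothetical features: mobility here is a linear function of restriction.
restriction = [1, 2, 3, 4, 5, 6, 7, 8]
features = {
    "restriction": restriction,
    "mobility": [2 * v + 1 for v in restriction],
    "promo": [1, 0, 0, 1, 0, 1, 0, 0],
}
flagged = flag_collinear(features)
```

A flagged pair is a prompt to drop, combine, or redefine one of the two features so each scenario lever maps to a single coefficient.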
Overforecasting as the baseline. The existing projections had systematic upward bias as a hedge against shortage risk. Evaluating model quality meant accounting for this: the target wasn't simply matching historical actuals but producing forecasts that were materially closer to actuals than an incumbent approach designed to be deliberately conservative.
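The comparison against a deliberately conservative incumbent can be illustrated with two metrics: MAPE for accuracy and a signed mean percentage error for bias. The numbers below are made up for illustration. The source does not say whether the 15% MAPE improvement is relative or in percentage points, so the relative reduction computed here is only one possible reading.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent. Actuals must be non-zero."""
    errors = [abs((f - a) / a) for a, f in zip(actuals, forecasts)]
    return 100 * sum(errors) / len(errors)


def mean_percentage_error(actuals, forecasts):
    """Signed bias in percent; positive means overforecasting on average."""
    errors = [(f - a) / a for a, f in zip(actuals, forecasts)]
    return 100 * sum(errors) / len(errors)


# Hypothetical data: the incumbent sits consistently above actuals.
actuals = [100.0, 200.0, 400.0]
incumbent = [120.0, 230.0, 440.0]
model = [95.0, 210.0, 400.0]

incumbent_mape = mape(actuals, incumbent)
model_mape = mape(actuals, model)
incumbent_bias = mean_percentage_error(actuals, incumbent)
model_bias = mean_percentage_error(actuals, model)
relative_reduction = (incumbent_mape - model_mape) / incumbent_mape
```

Reporting bias alongside MAPE separates two questions: how far off the forecasts are, and whether they lean in one direction.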
Status
- Research project delivered in 2021, received 8.5/10 from the client
- Deployed for one category; extended to additional categories in collaboration with the analytics department
- Confidence interval output handed off to the stochastic supply allocation model as a one-off integration
Next Steps
- Live confidence interval integration: replace the one-off handoff with a live input to the stochastic supply allocation model so forecast updates automatically propagate to allocation decisions