Building model resilience in data analytics

In this Normal Deviance column Hugh Miller draws parallels between the summer of extremes seen in Australia and the need to think about resilience when doing data analytics work.

Nature has shown its fierce potential this summer – a wretched and tragic bushfire season, capped off by massive rainfalls and a global coronavirus health emergency.

Sign of the times – from @theamandarose on Twitter

The direct human and physical costs of the bushfires, extreme weather and coronavirus are obviously the most important things to worry about in the short term. If you haven’t donated, it’s not too late to support the RFS or agencies working with suffering communities.

But many data analysts will have arrived back from summer break to discover models suffering from their own sort of shocks. In many cases, finely honed data analytics have given way to volatile experience and crude overrides to manage unhelpful model outputs. Think about:

  • An airline or hotel business that has detailed optimisation models for customer demand, trading off price and volume to maximise profit. A massive drop in demand moves the model well into uncharted territory.
  • A national retailer running real-time FMCG demand and elasticity models, now experiencing stock shortages plus unusual spikes in demand (e.g. face masks) or very low demand for particular products.
  • A government attempting to measure the impact of employment programs in an area where many businesses have shut down for various lengths of time.

All these cases use models to measure and manage quite detailed effects that rely on an underlying stability of the system. Remove that stability and the modelling quickly becomes a challenge.

How can we prevent our models turning to mush? While it’s hard to give general-purpose advice on building model resilience, here are a few thoughts:

  • Model interdependency: Complex model setups, where multiple models feed into each other, are much more difficult to manage when things go wrong. This is analogous to systems dependency more broadly – heavily integrated systems need much more redundancy and testing.
  • Model reversion and update pausing: For ‘online’ models that regularly react and update as new data rolls in, there is a chance that unusual inputs begin to skew the model outputs too. This can be difficult to detect without a good monitoring process. In such cases, an appropriate response may be to pause the updating process or to exclude batches of recent data from the model update (see the first sketch after this list). Having this functionality built into the model from the outset makes quick transitions smoother.
  • Sensible defaults and fallbacks: Most models give ‘good’ estimates when the current experience is close to historical levels but will be less reliable for outliers at the edge of past experience. Edge-case testing can be used to understand how sensible model outputs are in unusual circumstances and to check that the resulting recommendations are reasonable. If a situation is sufficiently uncertain, it can make sense to have a simpler fallback model that will be more robust (see the second sketch after this list).
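
To make the ‘update pausing’ idea a little more concrete, here is a minimal Python sketch. The class name and structure are purely illustrative – my own invention rather than anyone’s production design – and scikit-learn’s SGDRegressor simply stands in for whatever incremental learner is actually in use.

```python
from sklearn.linear_model import SGDRegressor

class PausableOnlineModel:
    """Wraps an incremental learner so that updating can be paused
    during abnormal periods, with the affected batches set aside."""

    def __init__(self):
        self.model = SGDRegressor()
        self.paused = False
        self.held_batches = []  # data received while updates were paused

    def update(self, X, y):
        if self.paused:
            # Don't let unusual data skew the model; keep it aside for review.
            self.held_batches.append((X, y))
        else:
            self.model.partial_fit(X, y)

    def pause_updates(self):
        self.paused = True

    def resume_updates(self, include_held=False):
        # Optionally feed the held batches back in once they have been
        # reviewed and judged representative; otherwise discard them.
        if include_held:
            for X, y in self.held_batches:
                self.model.partial_fit(X, y)
        self.held_batches = []
        self.paused = False

    def predict(self, X):
        return self.model.predict(X)
```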

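And here is a similarly rough sketch of the fallback idea: the detailed model is only trusted when the inputs look like past experience, and a simple, robust estimate is used otherwise. The quantile thresholds and the plain fallback value are illustrative assumptions, not a recipe – real checks would depend on the model and data at hand.

```python
import numpy as np

def predict_with_fallback(x, main_model, fallback_value, train_X, tolerance=0.1):
    """Use the detailed model only when the input resembles past experience;
    otherwise revert to a simple, robust fallback (e.g. a recent average)."""
    # Range of experience seen in the training data, feature by feature.
    lower = np.quantile(train_X, 0.01, axis=0)
    upper = np.quantile(train_X, 0.99, axis=0)
    span = upper - lower

    # Flag inputs that sit well outside the range of past experience.
    out_of_range = (x < lower - tolerance * span) | (x > upper + tolerance * span)
    if out_of_range.any():
        return fallback_value

    return main_model.predict(x.reshape(1, -1))[0]
```
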
Appropriate model resilience depends on the underlying model use and complexity – but for important models there is no reason why we can’t be building in resilience today, to help for next time.
