Data science teams are scrambling to update their models in the wake of extreme and unforeseen worldwide changes brought on by the global COVID-19 pandemic. In the face of these unprecedented events, one of the key concerns among many data scientists is that their current models could be generating inaccurate or misleading predictions. In the webinar, AI in Turbulent Times: Navigating Changing Conditions, we outlined the steps that data scientists can take to incorporate robustness into their model building processes.
Regardless of the systemic shock disrupting society and the economy, many organizations have critical functions, such as supply chain management, that leverage machine learning models and still need to operate as smoothly as possible. Distribution managers, for example, are working to ensure store shelves are stocked and deliveries are moving. As data science teams adjust to ever-changing circumstances, so too should the machine learning models they have in production.
Here are some key takeaways from the webinar that data scientists can leverage to account for historic unforeseen circumstances:
Take a Big Step Back
A poll taken at the beginning of the webinar about the confidence level of AI models in production, given the latest changes in the world, was telling. For those with models in production:
41.5% said they were certain some AI models are/will be producing bad predictions
41.5% said they suspect their AI models were producing bad predictions, but were unsure
17% said they have no idea how their AI models are performing
Because machine learning models are trained on historical data to predict future events, the COVID-19 pandemic presents a unique challenge for data scientists: there are no comparable past events that can inform confident predictions.
A starting point for data scientists is to take a big step back and ask how their models could be impacted. Data science teams should also consider which actions the models are impacting and how operations could change as a result. Travel, retail, and oil prices, for example, have all changed significantly in the past few weeks alone, so it falls to data science teams and their organizations to assess which features are most crucial to their models.
Understand the Changes
After data science teams have assessed which features have the greatest impact on their models, they can consider how changes in those features will affect their operations. With that understanding in place, organizations can get a more realistic picture of where their new business priorities should be as they contend with constantly changing circumstances.
Predicting missed appointments for medical outpatient visits isn't as important right now. On the other hand, building models to appropriately staff hospitals at the department level is currently vital.
Understanding what features the models are considering is essential to making the necessary changes and adjustments. These insights can help data science teams and business leaders pinpoint which operations they should prioritize.
Talk to your Models. Have a Plan.
Amid constantly shifting developments, data science teams need to do more than passively watch their models. It is important for data scientists to monitor changes to the models’ input distributions, consider how data drift could be affecting findings, and understand how these changes impact overall accuracy.
This requires implementing different thresholds for different models and considering in advance how they should be adjusted. When a model starts to stray from these thresholds, data science teams need to work with the relevant parties so that they can respond accordingly. Organizations should develop a monitoring strategy that is individualized for each model and allows for setting data drift or accuracy thresholds. Without this approach, model monitoring becomes like watching water boil: tedious and ineffective.
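The ideas above can be sketched in code. The following is a minimal illustration, not DataRobot's implementation: it scores drift between a training-time (baseline) feature distribution and recent production data using the Population Stability Index (a common drift metric), then compares the score against a per-model threshold set in advance. All model names and threshold values are hypothetical.

```python
import math

def psi(baseline, current, bins=10):
    """Population Stability Index between two numeric samples.

    Bin edges are taken from the baseline sample; current values
    outside the baseline range are clamped into the edge bins.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def bin_fractions(sample):
        counts = [0] * bins
        for x in sample:
            i = min(max(int((x - lo) / width), 0), bins - 1)
            counts[i] += 1
        # Smooth empty bins so log() below is always defined.
        return [(c or 0.5) / len(sample) for c in counts]

    b = bin_fractions(baseline)
    c = bin_fractions(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))

# Per-model drift thresholds, agreed on in advance as part of the
# monitoring strategy (names and values are illustrative).
DRIFT_THRESHOLDS = {
    "demand_forecast": 0.2,
    "churn_weekly": 0.25,
}

def check_drift(model_name, baseline, current):
    """Return the drift score and whether it breaches the model's threshold."""
    score = psi(baseline, current)
    return score, score > DRIFT_THRESHOLDS[model_name]
```

Identical distributions score near zero, while a shifted production distribution produces a large PSI, which is exactly the signal a team would route to the owners of that model.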
The next step is ensuring the relevant teams are alerted to model changes. After all, a real-time iPhone app model could involve a different team than a weekly customer churn model running on a terabyte of data. This coordination ensures that the correct parties are quickly notified of any changes affecting your model.
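One lightweight way to encode that coordination is a routing table mapping each model to the team that owns it, so drift alerts reach the right people automatically. This is a hedged sketch; the model names and addresses are made up for illustration.

```python
# Hypothetical routing table: which team is notified for which model.
ALERT_ROUTES = {
    "mobile_realtime_scoring": "mobile-oncall@example.com",
    "churn_weekly": "marketing-ds@example.com",
}

DEFAULT_ROUTE = "data-science@example.com"

def route_alert(model_name, message):
    """Format an alert addressed to the team responsible for the model."""
    recipient = ALERT_ROUTES.get(model_name, DEFAULT_ROUTE)
    return f"To: {recipient} | {model_name}: {message}"
```

In practice the formatted string would be handed to whatever paging or email system the organization already uses; the point is that ownership is declared up front, per model.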
The COVID-19 pandemic has resulted in significant disruptions to communities worldwide. Understanding how to adjust models to new developments and prepare for even further disruptions will be essential to navigating the uncertain landscape ahead.
Click the banner above for a full webinar recording. NOTE: Network traffic from overwhelming participation in the live webinar caused the audio to cut out at times. Jay and Rajiv will try to answer any questions that were missed.
About the Authors:
Jay Schuren is the General Manager of our Time Series activities at DataRobot. Jay joined DataRobot through the acquisition of Nutonian in 2017, where he led the customer-facing data science team and focused predominantly on time series use cases across all industries. Jay has over 10 years of experience and a PhD from Cornell University.
Rajiv Shah is a data scientist at DataRobot, where he works with customers to make and implement predictions. Previously, Rajiv has been part of data science teams at Caterpillar and State Farm. He enjoys data science and spends time mentoring data scientists, speaking at events, and having fun with blog posts. He has a PhD from the University of Illinois at Urbana Champaign.