We're hearing a lot about automated machine learning lately, inspired in part by growing demand and the shortage of data scientists. But like many innovations, automated machine learning did not simply appear out of the blue; it is the product of at least twenty years of development. In this post, we briefly recap this history.
Before Unica Software launched its successful suite of marketing automation software, the company's primary business was predictive analytics, with a particular focus on neural networks. In 1995, Unica introduced Pattern Recognition Workbench (PRW), a software package that used an automated grid search to optimize model tuning for neural networks. Three years later, Unica partnered with Group 1 Software (now owned by Pitney Bowes) to market Model 1, a tool that automated model selection over four different types of predictive models. Rebranded several times, the original PRW product remains as IBM PredictiveInsight, a set of wizards sold as part of IBM’s Enterprise Marketing Management suite.
Two other commercial attempts at automated predictive modeling date from the late 1990s. The first, MarketSwitch, consisted of a solution for marketing offer optimization, which included an embedded “automated” predictive modeling capability. In sales presentations, MarketSwitch provided little information about how this optimization worked. Nevertheless, they touted the "former Soviet rocket scientists" behind the technology, and promised customers they would be able to “fire their SAS programmers.” Experian acquired MarketSwitch in 2004, repositioned the product as a decision engine and replaced the automated modeling capability with its own outsourced analytic services.
KXEN, a company founded in France in 1998, built its analytics engine around an automated modeling technique called structural risk minimization. The original product had a rudimentary user interface, depending instead on API calls from partner applications; more recently, KXEN repositioned itself as an easy-to-use solution for marketing analytics, which it attempted to sell directly to C-level executives. This effort was modestly successful, leading to the sale of the company in 2013 to SAP for an estimated $40 million.
MarketSwitch and KXEN made little headway against conventional predictive analytics, for two reasons. First, they “solved” the problem by defining it narrowly: by limiting the scope of the optimization to a few algorithms, they minimized the engineering effort at the expense of model quality and robustness. Second, by positioning their tools as a means to eliminate the need for expert analysts, they alienated the few people in customer organizations who understood the product well enough to serve as champions.
In the last several years, the leading analytic software vendors (SAS and IBM SPSS) have added automated modeling features to their high-end products. In 2010, SAS introduced SAS Rapid Modeler, an add-in to SAS Enterprise Miner. Rapid Modeler is a set of macros implementing heuristics that handle tasks such as outlier identification, missing value treatment, feature selection and model selection. The user specifies a data set and response measure; Rapid Modeler determines whether the response is continuous or categorical, and uses this information together with other diagnostics to test a range of modeling techniques. The user can control the scope of techniques to test by selecting basic, intermediate or advanced methods. (SAS has recently rebranded this product as SAS Factory Miner.)
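To make the idea of response-type detection concrete, here is a minimal Python sketch. It is not SAS code and does not reproduce Rapid Modeler's actual heuristics, which the product documentation would have to confirm. The function names, the `max_classes` threshold and the technique lists are all assumptions chosen for illustration.

```python
from numbers import Number

def infer_response_type(values, max_classes=10):
    """Guess whether a response column is 'categorical' or 'continuous'.

    Hypothetical rule: non-numeric values, or numeric values with only a
    few distinct levels, are treated as categorical.
    """
    observed = [v for v in values if v is not None]  # ignore missing values
    if not observed:
        raise ValueError("response has no observed values")
    # Booleans are a subclass of int in Python, so exclude them explicitly.
    numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in observed
    )
    if not numeric:
        return "categorical"
    distinct = len(set(observed))
    return "categorical" if distinct <= max_classes else "continuous"

def candidate_techniques(response_type):
    """Map the response type to an illustrative list of techniques to test."""
    if response_type == "categorical":
        return ["logistic regression", "decision tree classifier"]
    return ["linear regression", "regression tree"]

print(infer_response_type(["yes", "no", None, "yes"]))
print(candidate_techniques(infer_response_type([i * 0.5 for i in range(50)])))
```

A real tool would combine a rule like this with other diagnostics, as the paragraph above notes; the point here is only that the response type narrows the set of techniques worth testing.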
IBM SPSS Modeler includes a set of automated data preparation features as well as Auto Classifier, Auto Cluster and Auto Numeric nodes. The automated data preparation features perform such tasks as missing value imputation, outlier handling, date and time preparation, basic value screening, binning, and variable recasting. The three modeling nodes enable the user to specify techniques to be included in the test plan, specify model selection rules and set limits on model training.
All of the software products discussed so far are commercially licensed. Given the machine learning community's orientation towards open source software, it's not surprising that some of the most innovative developments in automated machine learning are in community projects. Three projects deserve special mention: caret, Auto-WEKA and AutoML.
The caret package in open source R includes a suite of productivity tools designed to accelerate model specification and tuning for a wide range of techniques. The package includes pre-processing tools to support tasks such as dummy coding, detecting zero variance predictors and identifying correlated predictors, as well as tools to support model training and tuning. The training function in caret currently supports 192 different modeling techniques; it supports parameter optimization within a selected technique, but does not optimize across techniques. To implement a test plan with multiple modeling techniques, the user must write an R script to run the required training tasks and capture the results.
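The shape of such a script can be sketched in a few lines. The example below is plain Python rather than R, and it does not use or imitate the caret API: the two toy classifiers, the `TEST_PLAN` dictionary, the fold scheme and the synthetic data are all assumptions made for illustration. It shows the two nested loops involved: an outer loop over techniques, and an inner grid search over each technique's own tuning parameters, with cross-validated accuracy captured for every combination.

```python
import itertools
import random
from collections import Counter

def fit_knn(X, y, k):
    """k-nearest neighbours: returns a predict(point) function."""
    def predict(p):
        by_distance = sorted(
            range(len(X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], p)),
        )
        return Counter(y[i] for i in by_distance[:k]).most_common(1)[0][0]
    return predict

def fit_centroid(X, y, metric):
    """Nearest centroid: assign the class whose mean point is closest."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    power = 2 if metric == "euclidean" else 1
    def predict(p):
        return min(
            centroids,
            key=lambda c: sum(abs(a - b) ** power for a, b in zip(centroids[c], p)),
        )
    return predict

# The test plan: each technique carries its own tuning grid.
TEST_PLAN = {
    "knn": (fit_knn, {"k": [1, 3, 5]}),
    "centroid": (fit_centroid, {"metric": ["euclidean", "manhattan"]}),
}

def cv_accuracy(fit, params, X, y, folds=5):
    """Cross-validated accuracy, assigning row i to fold i % folds."""
    correct = 0
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        test = [i for i in range(len(X)) if i % folds == f]
        predict = fit([X[i] for i in train], [y[i] for i in train], **params)
        correct += sum(predict(X[i]) == y[i] for i in test)
    return correct / len(X)

def run_test_plan(plan, X, y):
    """Outer loop over techniques, inner grid search; best result first."""
    results = []
    for name, (fit, grid) in plan.items():
        keys = sorted(grid)
        for combo in itertools.product(*(grid[k] for k in keys)):
            params = dict(zip(keys, combo))
            accuracy = cv_accuracy(fit, params, X, y)
            results.append({"technique": name, "params": params, "accuracy": accuracy})
    return sorted(results, key=lambda r: r["accuracy"], reverse=True)

# Synthetic two-class data: 30 points around (0, 0) and 30 around (3, 3).
rng = random.Random(0)
X = [[rng.gauss(c, 1.0), rng.gauss(c, 1.0)] for c in (0, 3) for _ in range(30)]
y = [label for label in (0, 1) for _ in range(30)]

results = run_test_plan(TEST_PLAN, X, y)
for row in results:
    print(row)
```

Keeping the plan in a single dictionary means that adding a technique is a one-line change, and the sorted results table is what the analyst would inspect to choose a final model.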
Auto-WEKA is another open source project for automated machine learning. First released in 2013, Auto-WEKA is a collaborative project driven by four researchers at the University of British Columbia and Freiburg University. In its current release, Auto-WEKA supports classification problems only. The software selects a learning algorithm from 39 available algorithms, including 2 ensemble methods, 10 meta-methods and 27 base classifiers. Since each classifier has many possible parameter settings, the search space is very large; the developers use Bayesian optimization to solve this problem.
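A small sketch shows why the combined search space grows so quickly. The Python below is not Auto-WEKA code: the three algorithms, their parameter grids and the scoring function are hypothetical. It also uses plain random search as a stand-in for Bayesian optimization. Random search draws each configuration independently, whereas a Bayesian optimizer would use the scores observed so far to decide which configuration to try next.

```python
import math
import random

# Hypothetical search space: each algorithm has its own hyperparameter grid.
SEARCH_SPACE = {
    "decision_tree": {"max_depth": [2, 4, 8, 16], "min_leaf": [1, 5, 10]},
    "knn": {"k": [1, 3, 5, 7, 9], "weighting": ["uniform", "distance"]},
    "naive_bayes": {},  # no tunable parameters in this toy example
}

def count_configurations(space):
    """Total number of (algorithm, hyperparameter setting) pairs."""
    return sum(
        math.prod(len(values) for values in grid.values())
        for grid in space.values()
    )

def sample_configuration(space, rng):
    """Pick an algorithm first, then only the parameters that apply to it."""
    algorithm = rng.choice(sorted(space))
    params = {
        name: rng.choice(values)
        for name, values in sorted(space[algorithm].items())
    }
    return algorithm, params

def random_search(space, evaluate, budget, seed=0):
    """Score `budget` random configurations; return the best
    (score, algorithm, params) triple."""
    rng = random.Random(seed)
    best = None
    for _ in range(budget):
        algorithm, params = sample_configuration(space, rng)
        score = evaluate(algorithm, params)
        if best is None or score > best[0]:
            best = (score, algorithm, params)
    return best

def toy_score(algorithm, params):
    # Stand-in for a cross-validated accuracy; these numbers are made up.
    base = {"decision_tree": 0.70, "knn": 0.75, "naive_bayes": 0.65}[algorithm]
    return base + 0.001 * params.get("k", 0)

print(count_configurations(SEARCH_SPACE))  # 4*3 + 5*2 + 1 = 23
print(random_search(SEARCH_SPACE, toy_score, budget=20))
```

Even this toy space has 23 configurations from three algorithms with at most two discrete parameters each. With dozens of algorithms and continuous parameters, exhaustive evaluation is impractical, which is the motivation for a guided search.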
Challenges in Machine Learning (CHALEARN) is a tax-exempt organization supported by the National Science Foundation and commercial sponsors. CHALEARN organizes the annual AutoML challenge, which seeks to build software that automates machine learning for regression and classification. The most recent conference, held in Lille, France, in July 2015, included presentations featuring recent developments in automated machine learning, plus a hackathon.
As automated machine learning matures, there is also a shift in how we describe the capability. Where early commercial products like MarketSwitch and KXEN claimed to eliminate the need for experts, we now think of automated machine learning systems as productivity tools, as instruments to make experts more efficient and effective. Robotic surgery, for example, does not eliminate the need for cardiologists; it enables cardiologists to focus more effort on diagnosis and patient care. In a similar fashion, automated machine learning does not eliminate the expert analyst; it helps the expert analyst focus effort on understanding the business problem and explaining results, the true value drivers for advanced analytics.