Strengthen Your Anti-Money Laundering Program with Automated Machine Learning

There has been tremendous interest in using machine learning to improve risk detection and to strengthen risk management programs of all kinds. Anti-Money Laundering (AML) is one area where the benefit has been demonstrated: using machine learning to improve the detection of suspicious activity significantly reduces the number of false positive alerts that financial investigation units must comb through to find the relatively small number of cases requiring more intense scrutiny.

However, there is another equally beneficial use case that may be less obvious.

In every bank, the AML process begins when accounts are opened. “Know Your Customer” (KYC) rules require that banks do their due diligence in confirming the customer’s identity, address, and intended use of bank products as they open an account. Banks must determine which customers require enhanced due diligence because they pose a higher risk for money laundering based on the information gathered.  

Depending on the type of customer, their location, and the products, deciding which questions to ask new customers can get complex. For an organization composed of multiple legal entities engaged in international business, the number of questions a bank may ask can be intimidating to both bankers and customers.

To make matters more difficult, regulators do not provide explicit guidance on either the nature of questions or the number of questions to ask in each situation. In the recently released FFIEC BSA/AML Examination Procedures Manual, examiners are advised that:


“The assessment of customer risk factors is bank-specific, and a conclusion regarding the customer risk profile should be based on a consideration of all pertinent customer information...”


So, how do you know what the right questions are for the KYC process? Or, put differently, what are the best predictors of money laundering risk?

There is a balance to strike between the most exhaustive due diligence imaginable and a minimalist approach that aims to inconvenience the client as little as possible. To develop the optimal KYC process, banks must weigh cost, complexity, client experience, and predictive value in the questions and thresholds they set. Some banks even create dedicated “enhanced due diligence” units to handle high-risk clients when the answers to a small number of questions indicate a need for a more thorough vetting process.

Banks often rely on experience and expert opinion to determine which questions are most relevant, are indicative of money laundering risk, and are appropriate for the type of client, geography, product set, and intended usage. But experience and expert opinion may not be sufficient to achieve the best result, and because opinions vary, the resulting choices can be difficult to justify.

Fortunately, banks have a valuable asset which can be leveraged for their KYC efforts – their own data. Banks collect data on all their clients, have voluminous transaction details, track suspicious activity alerts, and know when Suspicious Activity Reports (SARs) have been filed with the Financial Crimes Enforcement Network (FinCEN).

Based on known outcomes – historical data containing transactional patterns which resulted in a suspicious activity warning, investigation, and ultimately whether or not a SAR was filed – banks can use the DataRobot automated machine learning platform to identify which customer data and transaction activity are indicative of a high risk for potential money laundering.  


DataRobot offers features that let you interpret which factors impact the model most.

DataRobot accomplishes this by testing the bank’s data against dozens of models, with the best-performing models providing valuable insights that help identify which customer data and transactional patterns are best at predicting potentially suspicious activities. This knowledge helps banks identify the right questions to ask during the KYC process and can help tune suspicious activity monitoring detection rules. Questions based on less predictive factors may not add value and could be eliminated from KYC and due diligence, saving time and reducing complexity.  
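To make the idea of "more predictive" versus "less predictive" questions concrete, here is a minimal sketch that ranks candidate KYC factors against a known outcome. It does not use or reproduce DataRobot's feature impact calculation. As a simple stand-in, it scores each factor on its own by AUC, the probability that a randomly chosen SAR case has a higher value than a randomly chosen non-SAR case. The records, field names, and values are all invented for illustration.

```python
# Hypothetical historical records: candidate KYC answers plus the known outcome.
# Every field name and value here is invented for illustration.
records = [
    {"cash_intensive": 1, "foreign_wires_per_month": 12, "years_in_business": 3,  "sar_filed": 1},
    {"cash_intensive": 1, "foreign_wires_per_month": 9,  "years_in_business": 10, "sar_filed": 1},
    {"cash_intensive": 0, "foreign_wires_per_month": 15, "years_in_business": 5,  "sar_filed": 1},
    {"cash_intensive": 1, "foreign_wires_per_month": 8,  "years_in_business": 8,  "sar_filed": 1},
    {"cash_intensive": 0, "foreign_wires_per_month": 2,  "years_in_business": 3,  "sar_filed": 0},
    {"cash_intensive": 0, "foreign_wires_per_month": 1,  "years_in_business": 10, "sar_filed": 0},
    {"cash_intensive": 1, "foreign_wires_per_month": 3,  "years_in_business": 5,  "sar_filed": 0},
    {"cash_intensive": 0, "foreign_wires_per_month": 0,  "years_in_business": 8,  "sar_filed": 0},
]


def auc(scores, labels):
    """Probability that a random positive case outscores a random negative case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count a win as 1, a tie as 0.5, a loss as 0 over all positive/negative pairs.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_factors(rows, target="sar_filed"):
    """Rank each factor by how well it alone separates the two outcomes."""
    labels = [r[target] for r in rows]
    factors = [k for k in rows[0] if k != target]
    strength = {}
    for f in factors:
        a = auc([r[f] for r in rows], labels)
        # Rescale so 0 means no signal and 1 means perfect separation.
        strength[f] = abs(a - 0.5) * 2
    return sorted(strength.items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_factors(records)
for factor, score in ranking:
    print(f"{factor}: {score:.2f}")
```

In this toy data, a factor that lands at the bottom of the ranking would be a candidate for removal from the questionnaire. A real analysis would use far more data and a multivariate model, since a factor that looks weak on its own can still matter in combination with others.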

For factors that are significant predictors of potential money laundering activity, DataRobot’s partial dependence features reveal the thresholds at which predictor values contribute strongly to the predicted money laundering risk. These thresholds could be used to determine which clients are subject to enhanced due diligence and which are not.


Partial Dependence visualizations help banks determine thresholds for risk.
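For readers who want to see the mechanics behind a partial dependence curve, the sketch below computes one by hand. It is not DataRobot's implementation. The `risk_model` function is a hypothetical stand-in for a trained model with made-up coefficients, and the customer records, field names, and the 0.5 cutoff are likewise assumptions chosen only to illustrate the technique: force one factor to each value on a grid for every customer, average the model's predictions, and look for where the average crosses a chosen risk level.

```python
import math


def risk_model(row):
    """Stand-in for a trained model; returns a probability-like risk score.
    The coefficients are invented for illustration."""
    z = -4.0 + 0.5 * row["foreign_wires_per_month"] + 1.5 * row["cash_intensive"]
    return 1.0 / (1.0 + math.exp(-z))


def partial_dependence(model, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value for every row."""
    curve = []
    for value in grid:
        # Copy each row with the feature overridden; other fields keep their real values.
        preds = [model({**row, feature: value}) for row in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve


def first_crossing(curve, cutoff):
    """Smallest grid value whose average predicted risk reaches the cutoff, else None."""
    for value, avg in curve:
        if avg >= cutoff:
            return value
    return None


# Hypothetical customers; only the fields the stand-in model reads are included.
customers = [
    {"foreign_wires_per_month": 1, "cash_intensive": 0},
    {"foreign_wires_per_month": 4, "cash_intensive": 1},
    {"foreign_wires_per_month": 9, "cash_intensive": 0},
    {"foreign_wires_per_month": 2, "cash_intensive": 1},
]

curve = partial_dependence(risk_model, customers, "foreign_wires_per_month", range(0, 16))
threshold = first_crossing(curve, cutoff=0.5)

for value, avg in curve:
    print(f"{value:>2} wires/month -> average predicted risk {avg:.3f}")
print("Candidate enhanced due diligence threshold:", threshold)
```

The crossing point is only a candidate threshold. Where to set the cutoff remains a policy decision that weighs cost and client experience against risk appetite.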

This is an example of how building a model can help you understand what your data is telling you.  In this case, your own data is the best source of knowledge on the right questions to include in your due diligence process. Decisions based on actual experience will be easier to explain and to justify. In addition, DataRobot’s model transparency and model risk management features can be used to create the documentation that regulators will expect.

As part of AML examination procedures, examiners are told:

“The bank should identify the specific risks of the customer or category of customers, and then conduct an analysis of all pertinent information in order to develop the customer’s risk profile.”


Using DataRobot and your own data to understand risk factors and identify strong predictors of high risk activity can aid tremendously in the design and calibration of your KYC process.  Ongoing monitoring with DataRobot will help you detect when risk factors or transactional patterns have changed and where recalibration of your KYC process may be in order.
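As an illustration of what detecting a change in risk factors can look like, the sketch below computes a Population Stability Index (PSI), a drift measure widely used in banking model monitoring. This is an assumed technique for illustration, not a description of how DataRobot monitors models. The bin edges, the sample values, and the factor they represent (monthly foreign wires) are hypothetical.

```python
import math


def psi(expected, actual, edges, floor=1e-6):
    """Population Stability Index between a baseline sample and a recent sample.

    `edges` are ascending bin boundaries; values below the first edge fall in
    bin 0, and values at or above the last edge fall in the final bin.
    """
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # The bin index is the number of edges the value has reached or passed.
            counts[sum(v >= e for e in edges)] += 1
        # Floor empty bins so the logarithm below is always defined.
        return [max(c / len(values), floor) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical monthly foreign wire counts at model-build time vs. today.
baseline = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
recent = [2, 5, 6, 8, 9, 10, 11, 12, 13, 14]
edges = [4, 8]

drift = psi(baseline, recent, edges)
print(f"PSI = {drift:.3f}")
```

A commonly cited rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift, but those cutoffs are conventions rather than regulatory requirements, and each bank would need to set its own.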

Using DataRobot in designing the KYC process helps banks strike the right balance between cost, client experience, and regulatory expectations while providing banks with valuable documentation to demonstrate rigorous KYC design based on actual bank-specific experience.



About the Author:

H.P. Bunaes leads the banking practice at DataRobot, helping banks leverage AI and machine learning for predictive analytics and data mining. H.P. has 35 years of experience in banking, with broad banking domain knowledge and deep expertise in data and analytics. Prior to joining DataRobot, H.P. held a variety of leadership positions at SunTrust, including leading the design and development of the risk data and analytics platform used enterprise-wide for risk management. H.P. is a graduate of the Massachusetts Institute of Technology, where he earned a master’s degree in Management Information Systems, and of Trinity College, where he earned a Bachelor of Science degree in Computer Science and Mechanical Engineering.