In this blog post, our partners from CGI describe their approach to fraud detection and anti-money laundering: a combination of machine learning and graph technology.
CGI is one of the largest IT and business process services providers in the world. As an IT services partner and expert of choice, we apply our deep industry knowledge and technology expertise to help clients navigate the complexity of digitalization across people, processes and technology.
The challenges of fraud detection for financial services, banking and insurance companies
Banking, insurance and financial services companies are facing a big challenge: detecting and reporting suspicious financial activities hidden in a growing volume of transactional data. In compliance with legal requirements, they also need to strengthen monitoring measures on “unusual” financial transactions.
These factors are changing the fight against fraud in the financial sector. Traditional analytical tools, often backed by relational databases, make it hard or impossible to cross-analyze complex data. As a result, analysts struggle to detect patterns across disparate data sources or to monitor suspicious activity efficiently.
CGI provides its clients with new solutions to effectively detect and prevent fraud.
Combining machine learning and graph technology to detect and investigate fraud in complex data
Linkurious Enterprise helps extract insights from complex connected data and detect suspicious connections or patterns thanks to a visualization interface and an alert dashboard. It’s an intuitive way to investigate large amounts of financial data and uncover fraud or money laundering.
Its alert dashboard lets users implement their own rule engine. The first step is to define a pattern using a graph query language. Linkurious Enterprise then monitors the data and generates an alert whenever the pattern is identified. Analysts can inspect the results within the alert dashboard and triage between real cases and false positives.
This system is typically used to implement expert-curated rules, but it can be enhanced with a machine learning approach. For this project, we decided to evaluate the use of machine learning algorithms to reinforce the detection of “unusual” transactions and adapt to fast-evolving fraud schemes.
This approach brings together the advantages of both graph technology and machine learning. The graph approach is well suited to representing and querying a large network of financial transactions. Machine learning can be used to better detect complex patterns and draw the attention of investigators to the most suspicious transactions.
Combined, the two technologies help investigators pinpoint complex patterns that may be hidden in the large collection of transactions they investigate.
Building a machine learning-based fraud detection application
The starting point for this fraud detection solution was the following question: How can we evaluate the effectiveness of machine learning compared to rule-based algorithms for the detection of unusual transactions?
- The first step (figure 1) was the development of a generator of random transactions (blue) which allows us to inject known fraudulent transactions (green). The generated datasets are stored in Neo4j.
- The second step (figure 2) was the benchmark of machine learning predictions vs. rule-based detection of fraudulent transactions. Here we evaluated the false discovery rate (FDR), i.e. the share of flagged transactions that are false positives, for the machine learning approach (green) vs. the rule-based detection (blue). The machine learning predictions produce fewer false positives than the rule-based detection, especially when the number of known unusual transactions is low. This suggests the machine learning approach is better suited to detecting rare events such as fraud.
- The third step was the visualization of the predicted transactions. Linkurious was a natural fit since it works with Neo4j without any advanced configuration.
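The first two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the generator or benchmark used in the project: the account names, the amount ranges, the "large amount" fraud signature and the 3,000 threshold rule are all assumptions made for the example, and the Neo4j storage step is left out.

```python
import random


def generate_transactions(n_normal, n_fraud, seed=0):
    """Return synthetic transactions, each carrying a known `is_fraud` label."""
    rng = random.Random(seed)
    accounts = [f"ACC{i:03d}" for i in range(50)]  # hypothetical account ids
    transactions = []
    for _ in range(n_normal):
        sender, receiver = rng.sample(accounts, 2)
        transactions.append({
            "sender": sender,
            "receiver": receiver,
            "amount": round(rng.uniform(10, 5_000), 2),
            "is_fraud": False,
        })
    for _ in range(n_fraud):
        sender, receiver = rng.sample(accounts, 2)
        # Assumed fraud signature for this sketch: unusually large amounts.
        transactions.append({
            "sender": sender,
            "receiver": receiver,
            "amount": round(rng.uniform(4_000, 20_000), 2),
            "is_fraud": True,
        })
    rng.shuffle(transactions)
    return transactions


def false_discovery_rate(transactions, flagged):
    """FDR = false positives / all flagged transactions (0.0 if none flagged)."""
    flagged_tx = [t for t, is_flagged in zip(transactions, flagged) if is_flagged]
    if not flagged_tx:
        return 0.0
    false_positives = sum(1 for t in flagged_tx if not t["is_fraud"])
    return false_positives / len(flagged_tx)


transactions = generate_transactions(n_normal=1_000, n_fraud=20)

# Hypothetical expert rule: flag every transaction above 3,000.
rule_flags = [t["amount"] > 3_000 for t in transactions]
print(f"Rule-based FDR: {false_discovery_rate(transactions, rule_flags):.2f}")
```

Because the injected transactions are labeled, the same `false_discovery_rate` function can score any detector's flags, which is what makes a side-by-side comparison of a rule and a trained model possible.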
The investigation of suspicious patterns of actions
We investigated two use cases: the detection of potentially fraudulent transactions with supervised learning and the detection of money laundering patterns with an unsupervised machine learning algorithm.
For the first case, past data and known labeled fraudulent transactions were used to train a supervised machine learning algorithm developed in Python. The results of this algorithm were saved in Neo4j with transaction nodes tagged with a risk property (either risky or not risky). The Linkurious Enterprise alert system was set up to detect all nodes with a “risky” property. The users could then use the alert dashboard triage interface to invalidate/validate the predictions.
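As an illustration of the tagging step, here is a minimal sketch. It assumes the model outputs a fraud score between 0 and 1 and that transactions are stored as nodes with a `Transaction` label; the 0.5 threshold, the label name and the transaction ids are hypothetical, and only the two-valued `risk` property comes from the setup described above. The Cypher string shows the kind of pattern an alert could be built on, and `alert_matches` mimics it in memory so the example runs without a database.

```python
RISK_THRESHOLD = 0.5  # assumed cut-off, not a value from the project

# Kind of Cypher pattern an alert could use; the `Transaction` label is an assumption.
ALERT_QUERY = "MATCH (t:Transaction) WHERE t.risk = 'risky' RETURN t"


def tag_risk(scored_transactions, threshold=RISK_THRESHOLD):
    """Turn (transaction id, model score) pairs into nodes with a `risk` property."""
    return [
        {"id": tx_id, "risk": "risky" if score >= threshold else "not risky"}
        for tx_id, score in scored_transactions
    ]


def alert_matches(nodes):
    """In-memory equivalent of ALERT_QUERY, for illustration only."""
    return [node for node in nodes if node["risk"] == "risky"]


scores = [("tx-1", 0.91), ("tx-2", 0.12), ("tx-3", 0.67)]  # made-up model output
nodes = tag_risk(scores)
alerts = alert_matches(nodes)
print([node["id"] for node in alerts])  # ['tx-1', 'tx-3']
```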
Users can inspect each individual result. With the Linkurious Enterprise investigation interface and their domain expertise, they can identify the suspicious cases. The user input, in the form of validated or invalidated alerts, is collected through the Linkurious Enterprise API and fed back into the machine learning algorithm for further rounds of training. The result is a self-learning process that becomes better at predicting “unusual” transactions by combining human intelligence and machine learning.
Compared to the traditional approach, investigators have fewer unusual transactions to evaluate, which lets them spend more time on actual fraudulent transactions instead of false positives.
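The feedback loop can be pictured as a simple label update before each retraining round. The sketch below is an assumption about how such a merge might look, not the project's code: the `"confirmed"` and `"dismissed"` verdict values and the dictionary layout are hypothetical, and the call that fetches verdicts from the Linkurious Enterprise API is deliberately left out.

```python
def merge_feedback(training_labels, verdicts):
    """Return a new label set that includes the analysts' triage decisions.

    training_labels: dict of transaction id -> bool (True means fraud)
    verdicts: dict of transaction id -> "confirmed" or "dismissed"
              (hypothetical values standing in for validated/invalidated alerts)
    """
    updated = dict(training_labels)  # copy, so the original labels stay untouched
    for tx_id, verdict in verdicts.items():
        if verdict == "confirmed":
            updated[tx_id] = True    # analyst agreed: real fraud
        elif verdict == "dismissed":
            updated[tx_id] = False   # analyst disagreed: false positive
        else:
            raise ValueError(f"unknown verdict {verdict!r} for {tx_id}")
    return updated


labels = {"tx-1": True, "tx-2": False}
verdicts = {"tx-3": "confirmed", "tx-4": "dismissed"}
merged = merge_feedback(labels, verdicts)
print(merged)
```

Each dismissed alert becomes a labeled negative example, which is how the next training round can learn to avoid the same false positive.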
For the second case, randomly generated “smurfing” patterns (a large sum split into many small transfers to stay below reporting thresholds) were injected into the transaction database and detected with a clustering algorithm (using sklearn). The flagged transactions are presented to the investigator for review.
The benefit of this approach is that it is completely unsupervised: it needs no labeled examples of past money laundering.
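To give a feel for the second case without extra dependencies, here is a small standard-library sketch. Note that it swaps in a simpler technique: instead of the clustering algorithm used in the project, it computes each account's fan-in (number of distinct senders) and flags accounts that sit far above the median, using a robust z-score. The account names, the amount ranges, the "just under 1,000" pattern and the cutoff of 10 are all assumptions made for the example.

```python
import random
import statistics
from collections import defaultdict


def generate_with_smurfing(n_normal=500, n_smurfs=40, seed=1):
    """Random transfers plus one injected smurfing pattern (no labels needed later)."""
    rng = random.Random(seed)
    accounts = [f"ACC{i:03d}" for i in range(100)]
    txs = []
    for _ in range(n_normal):
        sender, receiver = rng.sample(accounts, 2)
        txs.append((sender, receiver, round(rng.uniform(10, 5_000), 2)))
    # Injected pattern: many distinct senders, each moving an amount just under
    # an assumed 1,000 reporting threshold to the same collecting account.
    for i in range(n_smurfs):
        txs.append((f"SMURF{i:02d}", "COLLECTOR", round(rng.uniform(900, 999), 2)))
    return txs


def fan_in(txs):
    """Number of distinct senders per receiving account."""
    senders = defaultdict(set)
    for sender, receiver, _amount in txs:
        senders[receiver].add(sender)
    return {account: len(s) for account, s in senders.items()}


def flag_outliers(feature, cutoff=10.0):
    """Flag accounts whose feature is far above the median (robust z-score)."""
    values = list(feature.values())
    median = statistics.median(values)
    # Median absolute deviation; fall back to 1.0 to avoid dividing by zero.
    mad = statistics.median([abs(v - median) for v in values]) or 1.0
    return sorted(a for a, v in feature.items() if (v - median) / mad > cutoff)


txs = generate_with_smurfing()
suspicious = flag_outliers(fan_in(txs))
print(suspicious)
```

No fraud label is used anywhere in the detection step; the collecting account stands out only because its fan-in is unlike that of the other accounts, which is the property an unsupervised method relies on.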
A new approach to fraud detection and anti-money laundering
The combination of machine learning and graph technology is helping financial organizations reduce false positives and streamline the investigation process. Graph technology improves investigation effectiveness by reducing silos and providing an exhaustive network view of complex data. Machine learning assists human experts, reduces the time spent on false positives and improves the detection of suspicious patterns. Combined, they offer an efficient approach to financial crime detection while helping organizations comply with legal requirements.
About the author
Michael Girardot has a PhD in genetics and specializations in epigenomics and bioinformatics. Graph technology has been an integral part of his research activities on functions of molecular networks. He is currently a Senior Data Scientist at CGI.