Linkurious Enterprise

How to detect bank loan fraud with graphs: part 1

April 15, 2014


Every day, banks and insurance companies fall victim to fraud. Criminals target them, open accounts, ask for loans and credit cards… and one day disappear. Eventually, banks have to write off the money loaned to fraudsters. It is estimated that in Canada alone, this fraud scheme costs around $1B per year. We are going to see how graph analytics is one of the best weapons anti-fraud teams have to fight back.

This article is divided into two parts. In the first one, we’ll see how fraudsters operate. This will help us identify their weaknesses and prepare us for the second part. Next week, we’ll see how to capitalize on these weaknesses and use graphs to catch criminals. Ready?

How to scam banks and make millions: first-party fraud, identity theft and fake identities

Criminals are an imaginative bunch. Experienced fraud and risk analysts will encounter a lot of techniques used by ill-intentioned individuals looking to make money. Among these techniques, some relate to first-party fraud and some to third-party fraud.

Third-party fraud is what most people think of when they think about fraud. In this case, a criminal steals another person’s identifying information to commit the crime. This method works so well that it has bolstered a black market where real identity information is sold. The price for your credit card number, social security number, expiration date and mother’s maiden name? Probably around $5!

First-party fraud is a lesser-known situation. In this case, the criminal uses his own identity or builds a fake identity to defraud the bank.

In practice, the line between these two scenarios can get blurry. The most powerful technique based on fake identities is synthetic identity fraud, which consists of establishing a fake identity using a combination of real and fake information. But how does this work?

The art of forging fake identities

Using a fake identity for fraud is alluring. When the fraud is done, there will be no individual left to complain and the bank will lose time looking for a ghost. But it does take a fair bit of organization.

Stolen identity information can be a good starting point. Sometimes all it takes is one real piece of information, like a social security number. Equipped with this, a fraudster can forge a first identity. He will invent a name, age, date of birth, address, phone number, etc. The goal is to create a (fake) identity that has all the characteristics of a real one. Fraudsters will make sure that their identity can pass the security checks of their targets. This is where having a “real”, verifiable piece of identity information (like a social security number) is useful.

The fraudster now has a first fake identity. He can start creating new identities. By slightly altering the information he used to create his first identity, he can give birth to fake individuals: a “John Doe”, a “Jane Doe” and a “John Smith”. All it takes is changing a name, modifying a phone number and re-using the rest of the identity information.

Once it has started, the process gets easier. Using a “fake” identity, someone can obtain new documents: driver’s license, phone bills, bank account information, etc. After a while, an experienced fraudster becomes effective at recycling pieces of information and creating new identities with minimal effort.

Fraudsters are effectively creating and managing a graph of fake identities. These fake identities are connected to bank accounts, phone numbers, social security numbers, identity papers, addresses, etc. As the information is often recycled, connections exist between fake identities that share one or more pieces of information.

Modeling customer information with a graph: two suspicious customers share an address, a phone number and an SSN.
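The graph described above can be sketched in a few lines of Python. This is only a minimal illustration, not a production model: the customer records, the attribute names (`address`, `phone`, `ssn`) and the helper functions `build_identifier_index` and `shared_identifiers` are all hypothetical and invented for this example. Each piece of information is treated as a node, and a customer is linked to every piece of information it declares. Pieces of information linked to more than one customer are the recycled ones.

```python
from collections import defaultdict

# Hypothetical customer records; every name and value is invented for illustration.
customers = {
    "John Doe": {"address": "12 Main St", "phone": "555-0101", "ssn": "000-00-0001"},
    "Jane Doe": {"address": "12 Main St", "phone": "555-0101", "ssn": "000-00-0001"},
    "Alice Martin": {"address": "4 Oak Ave", "phone": "555-0199", "ssn": "000-00-0002"},
}


def build_identifier_index(records):
    """Map each (kind, value) information node to the customers linked to it."""
    index = defaultdict(set)
    for name, attributes in records.items():
        for kind, value in attributes.items():
            index[(kind, value)].add(name)
    return index


def shared_identifiers(records):
    """Return the information nodes that are linked to more than one customer."""
    index = build_identifier_index(records)
    return {ident: names for ident, names in index.items() if len(names) > 1}


shared = shared_identifiers(customers)
for (kind, value), names in sorted(shared.items()):
    print(kind, value, sorted(names))
```

With this sample data, the two "Doe" customers are linked through the same address, phone number and SSN, while the third customer shares nothing with anyone.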

From the perspective of the fraudster, there are a few challenges associated with this approach:

generating new identities: the fraudster has to constantly mix existing information and fake information to build identities;
keeping track of everything: the fraudster has to know the fake identities very well in order to use them.

If you’re familiar with graph databases, you probably know that working with graph data is not easy, especially if you do not have the right tools. We’ll see in the second part of this article that the challenges the criminals face are good opportunities for fraud and risk analysts. Empowered by graph analytics, banks can fight back against less graph-savvy fraudsters and catch them.

Executing the attack

Before we actually start talking about identifying fraudsters, let’s dive further into their methods. We have seen that they like to generate fake identities and use them to open accounts. How can they make money out of this? It’s time to see how fraudsters attack banks.

The basic scenario is this: a criminal opens an account with a bank, asks for a credit card, runs up a debt and disappears without repaying it. Even the smallest fraud can amount to a few thousand dollars and take place in a matter of weeks.

More sophisticated criminals will go for larger payoffs. This can be done with a set of fake identities and a lot of patience. Recently, the Canadian police were able to uncover a well-organized fraud ring. A five-month investigation called Operation Mouse unmasked a synthetic identity scheme responsible for $25 million in fraud losses, in which credit card bills and mortgages were never repaid. This case offers a rare glimpse into the methods of criminals.

Here is how the fraudsters worked:

the ring leaders generated synthetic identities;
their associates paid individuals to assume the synthetic identities and open bank accounts;
the fraudsters used the accounts “normally”: they paid for goods with their cards and repaid their credit cards;
as they built up a good reputation with their banks, the fraudsters asked for loans and increased their credit card limits;
one day they emptied the accounts, maxed out their credit cards and disappeared.

This method takes patience and organization. The crucial part for fraudsters is to appear as normal as possible. It may take years of legitimate behavior. At this stage, the fraudsters are apparently great customers. Progressively, they can earn the trust of banks. That trust is then transformed into large bank loans. When the criminals feel like they have obtained as much as they can, they unwind their scheme. In Operation Mouse, the criminals made an estimated $25 million.

Are banks and their customers doomed to accept fraud as the price of doing business?

Thankfully, fraudsters do get arrested, as in the case of Operation Mouse. But why do they succeed so often anyway? Large banks like HSBC or JPMorgan have millions of customers. Finding the criminals among these customers is hard, especially when dealing with individuals who are sophisticated, organized and patient. What can be done to identify the criminals then?

Banks have a lot of data on their customers. In principle, they could use it to identify suspicious people. How come they have such a hard time detecting fake identities? Is it normal that individuals can recycle pieces of information to create different accounts with the same bank?

This is actually not surprising. Detecting loan fraud and fake identities is all about connections. Traditional, table-oriented databases are notoriously bad at handling connected data. As a consequence, the tools built on top of them, such as business intelligence or data visualization solutions, struggle with connections too. This is the reason why banks have such a hard time identifying the common pieces of information that connect fraudsters together.

Graph databases, on the other hand, are perfect for modelling, storing and querying connected data. They are key in fighting back against fraud. Let's now see how banks can use graph analytics to catch fraudsters.
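To give a sense of what "connections" means in practice, here is a rough sketch in plain Python, not tied to any particular graph database. It assumes hypothetical customer records and a hypothetical helper named `find_rings`. The idea is that two customers who share nothing directly can still belong to the same group if a third customer links them, which is the kind of multi-hop question that is awkward to express over flat tables.

```python
from collections import defaultdict

# Hypothetical records, invented for illustration. John Doe and John Smith
# share nothing directly, but both share something with Jane Doe.
records = {
    "John Doe": {"address": "1 Elm St", "phone": "555-0101", "ssn": "000-00-0001"},
    "Jane Doe": {"address": "9 Pine Rd", "phone": "555-0101", "ssn": "000-00-0002"},
    "John Smith": {"address": "9 Pine Rd", "phone": "555-0103", "ssn": "000-00-0003"},
    "Alice Martin": {"address": "4 Oak Ave", "phone": "555-0199", "ssn": "000-00-0004"},
}


def find_rings(records):
    """Group customers into clusters connected by any chain of shared information."""
    # Index customers by each piece of information they declare.
    by_identifier = defaultdict(set)
    for name, attributes in records.items():
        for kind, value in attributes.items():
            by_identifier[(kind, value)].add(name)

    # Two customers are neighbours if they share at least one piece of information.
    neighbours = {name: set() for name in records}
    for names in by_identifier.values():
        for name in names:
            neighbours[name] |= names - {name}

    # Depth-first traversal to collect connected components.
    seen = set()
    clusters = []
    for start in records:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        cluster = set()
        while stack:
            current = stack.pop()
            cluster.add(current)
            for other in neighbours[current]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        # Keep only clusters that actually connect several customers.
        if len(cluster) > 1:
            clusters.append(cluster)
    return clusters


clusters = find_rings(records)
print(clusters)
```

In this sample, the three "John" and "Jane" customers end up in a single cluster even though only neighbouring pairs share a phone number or an address. A shared piece of information is not proof of fraud on its own (family members share addresses, for instance), so a cluster like this is a lead for an analyst rather than a verdict.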