Archive | Tutorial


Visualizing the network of Donald Trump

On January 15, BuzzFeed released a large dataset of Donald Trump’s connections, including people, organizations and the nature of their relationships.

In their article “Help Us Map TrumpWorld”, the four authors of the investigation, John Templon, Anthony Cormier, Alex Campbell and Jeremy Singer-Vine, asked the public to help them understand and analyse the data.

Now we are asking the public to use our data to find connections we may have missed, and to give us context we don’t currently understand. We hope you will help us — and the public — learn more about TrumpWorld and how this unprecedented array of businesses might affect public policy.

So we decided to see what it would look like in Linkurious, our graph analysis and visualization tool.

The dataset is publicly available in a Google spreadsheet. We imported it into a Neo4j graph database using the following script, inspired by Michael Hunger’s work:

The result is a graph of 770 nodes and 611 edges. We then connected Linkurious to this dataset to start exploring it. Below are some visual representations of the TrumpWorld dataset in Linkurious:

An overview of the organizations connected to Trump and their ties.


To clarify the visualization, we added some custom labels. All nodes are organizations. The ones whose name contains the word “bank” are half orange. The ones whose name contains the word “hotel” are half blue. And the ones whose name contains “Trump” or “DT” are half pink. All the others are entirely green.

As reported by Forbes in May 2016, Trump’s Personal Financial Disclosure (PFD) listed the future President as being associated with 515 different organizations. Among his most valuable assets were the Trump Tower Commercial LLC and the 40 Wall Street LLC. Let’s look up the connections between those two entities.

Graph visualization of the connections between Trump Tower Commercial LLC and 40 Wall Street Commercial LLC.


Trump Tower Commercial LLC and 40 Wall Street Commercial LLC have seven connections in common. Both companies share an “owns collateralized debt” relationship with investment funds and asset management companies such as The Vanguard Group or Russell Investments.
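Finding the connections two entities have in common comes down to intersecting their neighbour sets. Here is a minimal Python sketch of that idea, outside of Linkurious and Neo4j. The edge list is illustrative only: The Vanguard Group and Russell Investments are named above, while “Example Fund” is a made-up placeholder, and the function names are our own.

```python
from collections import defaultdict

# Illustrative edges only. "Example Fund" is a hypothetical placeholder.
EDGES = [
    ("Trump Tower Commercial LLC", "The Vanguard Group"),
    ("Trump Tower Commercial LLC", "Russell Investments"),
    ("Trump Tower Commercial LLC", "Example Fund"),
    ("40 Wall Street Commercial LLC", "The Vanguard Group"),
    ("40 Wall Street Commercial LLC", "Russell Investments"),
]


def build_neighbors(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def common_connections(neighbors, first, second):
    """Return the nodes directly connected to both `first` and `second`."""
    return neighbors[first] & neighbors[second]


NEIGHBORS = build_neighbors(EDGES)
SHARED = common_connections(
    NEIGHBORS, "Trump Tower Commercial LLC", "40 Wall Street Commercial LLC"
)
print(sorted(SHARED))
```

In a graph database the same question is answered by a pattern query, but the set intersection shows what is being computed.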

We can also take a closer look at DJT Holdings LLC, one of Donald Trump’s most important assets:

Graph visualization of the entities connected to DJT Holdings LLC.


The visualization shows that DJT Holdings LLC is connected to many other companies via an “ownership” relationship. The company owns more than 30 other organizations, working as a corporate umbrella for the Trump Empire. DJT Holdings is itself owned by the Donald J. Trump Revocable Trust.

Finally, if we look into the well-known Trump Hotel & Casino Resorts, Inc., we get the following result:

Visualization of the network around Trump Hotel & Casino Resorts, Inc.


About thirty organizations gravitate around Trump Hotel & Casino Resorts, Inc. as “Subsidiaries”. Once again, this confirms Donald Trump’s taste for complicated corporate structures.

Graph visualization tools help understand complex connections within large datasets. Here, the TrumpWorld dataset was turned into a clear view of the network of organizations around Donald Trump, in which we can quickly identify key organizations and schemes.
We will soon update this case with more data, notably the network and connections of people.

If you want to visualize and investigate the TrumpWorld data in Linkurious, simply download the database here, connect it to Neo4j and use Linkurious to visualize it.

Reinforcing AML systems with graph technologies

Fighting financial crimes is a daily battle worldwide. Organizations have to deploy intelligent systems to prevent and detect wrongdoings, such as anti-money laundering (AML) control frameworks. We’ll see in this blog post how graph technologies can reinforce those systems.

Using graph technologies to fight financial crimes

In today’s complex economy, law enforcement and financial organizations fight against a wide range of financial crimes: embezzlement, tax evasion, extortion, corruption, terrorism funding or money laundering, to name a few. While tracking down those activities, governments and financial institutions have to deal with a fast-moving financial crime landscape and a growing volume of information in various formats.

Graph technologies like Linkurious can be powerful assets to help fight financial crimes. They provide exhaustive overviews of the different entities and their connections, and they support complex queries on large datasets in near real time.

In this article, we’ll focus on anti-money laundering procedures and explore a specific case with a graph approach.

Strengthening AML controls with network analysis

Money laundering is the act of converting proceeds from criminal activities into legal assets, concealing their true origins. Governments have been steadily strengthening AML rules to prevent those activities. Banking institutions are now required to follow strict AML policies and to report any suspicion of money laundering. Failure to comply can result in heavy financial penalties.

Organizations began to develop risk-based AML frameworks to monitor their customers and financial transactions. But criminals deploy sophisticated tactics to hide their wrongdoings. Shell corporations, tax havens or complex financial schemes are used to prevent identification or tracking of money flows. To thwart such criminal strategies, finding information about a specific suspicious entity is not enough. Financial crime units have to investigate the connections between individuals, accounts, companies, locations, to trace complex transactions. This is why network analysis and visualization technologies turned out to be efficient tools to support AML processes.

We will see below how graph technologies like Linkurious can be an additional asset when it comes to monitoring high risk customers for example.

Financial activities visualized as networks

Banking institutions keep track of numerous information sources about their customers (individuals or companies) and their financial activities. Graph database (GDB) technologies like Neo4j, Titan, AllegroGraph or DataStax Enterprise Graph make it possible to index complex connected data and easily query it to find patterns. With such systems, organizations can compile various information into a single data model.

A possible graph model of financial information

Linkurious provides an advanced graph interface compatible with numerous graph databases to easily explore and monitor the data.

Identifying money laundering patterns with Linkurious

AML regulations require banks to monitor their high-risk customers. Listed on special watchlists, those individuals can be identified either by authorities (e.g. Politically Exposed Persons, Specially Designated Nationals lists) or by the institution itself (e.g. customers with repeated suspicious transactions). “Are my customers currently involved in activities with flagged individuals?” “If yes, are these activities suspicious?” Organizations need to be able to answer those questions.

Linkurious offers an interface to monitor graph data in real time. In addition, analysts can set up alerts for specific patterns with Cypher queries. For instance, as an AML analyst, I want to be warned each time there is any type of connection between my customers’ financial activities and my watchlist. I can use the following query to create my alert in the system:

Creation of an alert query
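To make the logic of such an alert concrete, here is a minimal Python sketch that restates the idea outside of Linkurious: starting from each customer, walk the graph for a few hops and report any watchlisted individual that is reached. It is not the alert query itself. The node identifiers (`account-001`, `Other Customer`), the `max_hops` limit and the function names are assumptions made for illustration.

```python
from collections import deque

# Illustrative, fictitious data. Account ids and "Other Customer" are hypothetical.
EDGES = [
    ("Miboo", "3557 Straubel Circle"),            # customer registered at this address
    ("Angela Marshall", "3557 Straubel Circle"),  # watchlisted person at the same address
    ("Miboo", "account-001"),
    ("Other Customer", "account-002"),
]
CUSTOMERS = {"Miboo", "Other Customer"}
WATCH_LIST = {"Angela Marshall"}


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def watch_list_hits(adjacency, customers, watch_list, max_hops=4):
    """Map each customer to the watchlisted nodes reachable within max_hops."""
    alerts = {}
    for customer in sorted(customers):
        distance = {customer: 0}
        queue = deque([customer])
        hits = set()
        while queue:
            node = queue.popleft()
            if node in watch_list and node != customer:
                hits.add(node)
            if distance[node] == max_hops:
                continue  # do not expand beyond the hop limit
            for neighbor in adjacency.get(node, ()):
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        if hits:
            alerts[customer] = hits
    return alerts


ALERTS = watch_list_hits(build_adjacency(EDGES), CUSTOMERS, WATCH_LIST)
print(ALERTS)
```

Limiting the number of hops keeps the search focused: in a densely connected dataset, almost everything is connected to everything else if paths are allowed to grow without bound.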

If new data are collected, such as transactions, persons, companies or relationships, Linkurious will automatically update and look for suspicious connections. With the advanced graph visualization interface, it’s then easy to investigate and assess the different cases.

Visual investigation of financial activities

The alert system reported several matches to our query. To evaluate the risk-level of the cases, analysts can use the interface to quickly visualize and investigate. Let’s check one of them:

Visualization of one of the matches signaled by the alert query


At a glance, I see that Angela Marshall (a fictitious individual who appears on my watchlist) is indirectly connected to a transaction on a customer’s bank account. She appears to share the same address as my customer, the company Miboo.

This pattern is relatively suspicious. I might want to explore beyond this single connection and see which other entities are linked to this address.

Investigating the entities linked to the address


In addition to sharing an address with an individual on a watchlist, our suspicious customer also shares its address with three other companies and two of our internal employees. They are all located at 3557 Straubel Circle, in the city of Hongqiao, China.

The address, 3557 Straubel Circle is located in Hongqiao, in China


As an analyst, I might recognize a known pattern of money laundering: several companies registered at a single address. In addition, some employees are connected to a known high-risk customer. This information can be reported to the relevant authorities for further investigation in the field.
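The “several companies at one address” pattern can be expressed very simply: group the companies by address and keep the addresses that host more than one. A minimal Python sketch follows. The company names other than Miboo, the second address and the `min_companies` threshold are hypothetical.

```python
from collections import defaultdict

# Fictitious registrations. "Company A" to "Company D" and "1 Example Road" are placeholders.
REGISTRATIONS = [
    ("Miboo", "3557 Straubel Circle"),
    ("Company A", "3557 Straubel Circle"),
    ("Company B", "3557 Straubel Circle"),
    ("Company C", "3557 Straubel Circle"),
    ("Company D", "1 Example Road"),
]


def shared_addresses(registrations, min_companies=2):
    """Return {address: companies} for addresses used by at least min_companies."""
    by_address = defaultdict(set)
    for company, address in registrations:
        by_address[address].add(company)
    return {
        address: companies
        for address, companies in by_address.items()
        if len(companies) >= min_companies
    }


FLAGGED = shared_addresses(REGISTRATIONS)
print(FLAGGED)
```

A shared address is not proof of wrongdoing in itself (business centres and registered agents host many companies), so a result like this is a lead for the analyst rather than a conclusion.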

Graph analysis focuses on relationships and therefore helps to discover hidden connections between different entities. Linkurious also provides an alert system that operates in near real time. That way, financial crime units can identify suspicious activity schemes quickly and reinforce their AML systems.

Leverage the power of graph analysis and visualization to fight financial crimes today. Try the Linkurious demo or contact us to discuss your project.

Using graphs for intelligence analysis

Identifying and monitoring terrorist or criminal networks is essential to detecting threats and preventing attacks. Let’s see how Linkurious and graph visualizations can help identify and track potentially dangerous individuals and networks.

Challenges for intelligence analysis

Criminal or terrorist activities are rarely the acts of isolated individuals. Behind these activities we find more or less centralized organizations or networks. Intelligence experts are in charge of identifying every actor of such groups, despite their strategies to hide their connections to the networks (encrypted communication services, numerous middlemen, fake identities, etc.). Getting the whole picture of the network is essential to monitor suspect activities, prevent attacks or detect potential threats.

Countering such activities is also about gathering as much information as possible, from any possible source. The more data intelligence and security organizations are able to obtain, the easier it is to track and anticipate criminal or terrorist activities. This means that analysts and investigators have to handle large sets of heterogeneous data.

Graph analysis is particularly suited to this sort of challenge. Graph databases allow organizations to store and query in near real time the relationships between billions of entities. Let’s see how these systems, combined with tools like Linkurious, can help intelligence analysts identify and investigate threats.

Applying a graph approach to intelligence analysis

We will dive into the investigation of a potential terrorism threat and explore how Linkurious can help identify and investigate suspicious networks.

For this purpose, we have created a dataset with fictitious data about people, including addresses, phone numbers and travel information. This data can easily be modeled as a graph:

Graph data model of our investigation data


To keep our analysis understandable, we chose a very simple model with only a limited volume of data. A real situation would involve larger volumes and a wider range of data types.

Data entities, such as individual, email, phone, are modeled as nodes. Relationships between entities are symbolized with edges, labeled with the nature of the connection. The data then forms a network.

In our graph model we have four types of nodes (people, countries, addresses and phone numbers) and as many types of edges, or relationships.
Let’s start our investigation by trying to detect suspicious patterns in our data.

How to use graph patterns to detect potential threats

When dealing with large datasets, we need to find ways to focus the analysts’ attention on relevant information. Here, we want to detect potential terrorist cells. We are going to try to detect groups of at least three people who 1) visited an at-risk country (in our case Syria) and 2) are indirectly in contact (via their addresses or phone communications).

With a simple Cypher query, Linkurious users can set up monitoring for chosen patterns. Below is the query we will use to identify our pattern:

// Detecting threats:
MATCH (a:Person)-[s:HAS_CONTACTED|HAS_PHONE|HAS_ADDRESS*..10]-(b:Person)-[:HAS_BEEN_TO]->(d:Country {name:'Syria'})
WITH a, collect(s) as rels,collect(distinct b) as suspects,d,count(distinct b) as score
WHERE score > 2
RETURN a,suspects
ORDER BY score DESC

Linkurious reported three individuals: Jessica Wells, Bobby Murphy and Ruth Warren (on the left of the graph). As an analyst, I can visualize them and how they are interconnected. Jessica, Bobby and Ruth each display a “has been to” relationship with Syria and all appear to be connected to a single phone number: Judy Lewis’ (on the right of the graph).

Visualization of a suspicious network around Jessica, Bobby & Ruth


Several intermediate nodes sit between our three people and Judy’s phone number. Phone calls and addresses are the bridges that connect our individuals. For analysts, this particular pattern could point toward a recruiting network, with numerous middlemen to avoid detection. Those results could lead to specific recommendations and further investigations.
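For readers less familiar with Cypher, the detection pattern used above can be restated as a breadth-first search. The Python sketch below is only an approximation of the query, written under several assumptions: the tiny graph is made up (the `phone-...` identifiers are hypothetical), a person is never counted as their own contact, and the function names are ours. For each person, it collects the other people reachable within 10 hops of phone, address or contact links who have been to Syria, and keeps the cases with at least three of them.

```python
from collections import deque

# Fictitious data. The "phone-..." node ids are hypothetical placeholders.
PERSONS = {"Jessica Wells", "Bobby Murphy", "Ruth Warren", "Judy Lewis"}
CONTACT_EDGES = [
    ("Jessica Wells", "phone-jessica"),
    ("Bobby Murphy", "phone-bobby"),
    ("Ruth Warren", "phone-ruth"),
    ("Judy Lewis", "phone-judy"),
    ("phone-jessica", "phone-judy"),
    ("phone-bobby", "phone-judy"),
    ("phone-ruth", "phone-judy"),
]
VISITED = {
    "Jessica Wells": {"Syria"},
    "Bobby Murphy": {"Syria"},
    "Ruth Warren": {"Syria"},
}


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def reachable_within(adjacency, start, max_hops):
    """Return every node other than `start` reachable in at most max_hops."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]
    return set(distance)


def suspicious_groups(adjacency, persons, visited, country="Syria",
                      max_hops=10, min_suspects=3):
    """List (person, suspects) pairs, highest number of suspects first."""
    results = []
    for person in sorted(persons):
        suspects = {
            other
            for other in reachable_within(adjacency, person, max_hops)
            if other in persons and country in visited.get(other, ())
        }
        if len(suspects) >= min_suspects:
            results.append((person, sorted(suspects)))
    results.sort(key=lambda item: len(item[1]), reverse=True)
    return results


GROUPS = suspicious_groups(build_adjacency(CONTACT_EDGES), PERSONS, VISITED)
print(GROUPS)
```

On this toy graph the only match is Judy Lewis, whose phone number links the three travellers. Each of the three travellers only reaches two other travellers, which is below the threshold.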

A graph approach makes it possible to detect specific patterns across data sources. With Linkurious, it is easy to visualize and understand both the network and the relationships between its members. Node-and-edge graph visualizations combine all the available information in a single representation.
Some of the nodes here seem to be connected to other entities. Linkurious allows analysts to interactively explore the data and uncover new information.

Investigating complex networks with graph visualization

We identified a potential network with several people. Perhaps they have accomplices? We can try to investigate further, starting from one node of the network. Let’s pick Judy’s phone number for instance and expand the nodes around it.

Investigating Judy’s closest connections via her phone number


Judy is connected to a certain Robert Wells, via phone communications, and Robert is himself connected to Theresa Mills’ phone number. If we expand the nodes linked to Theresa’s phone, we get the following visualization.

Visualization of a sub-network around Theresa’s phone number


The sub-network around Theresa Mills is very specific. The nodes, all linked together, are phone numbers associated with seven individuals. Such a pattern (a small, highly connected group with a single bridge to other potential suspects) represents a sub-network within the larger network we are investigating.

From a single node, we reached another group and gathered new information about the network. Interactive and scalable tools like Linkurious make this kind of exploration and analysis easier for experts.

Visualize and analyse intelligence and security data with Linkurious

Graph approaches are well suited to the investigation of criminal networks and terrorist groups. Linkurious offers intelligence analysts a single entry point to identify hidden insights in complex connected data. Analysts can define specific patterns to monitor suspicious activities. The visualization interface allows them to navigate between the nodes to identify new key actors through hidden connections.

Discover how you can identify hidden insights in your graph data and try the demo of Linkurious.

Graph data visualisation for cyber-security threats analysis

In this blog post, we will offer an overview of how to deal with security information and event management/log management (SIEM/LM) data overflow. Let’s see how Linkurious’ advanced graph visualisation solution helps identify and investigate cyber-security threats.

Switching to a data lake architecture is often a required first step for analysts who wish to use graph data visualisation solutions such as Linkurious to start visualising their SIEM/LM data. Linkurious enables analysts to deal with SIEM/LM data overflow and perform precise real-time and/or post-attack forensics analysis. In the second part, we will demonstrate the extent of Linkurious’ possibilities using a real life SIEM/LM data-set use case and perform a forensics analysis example.

Dealing with SIEM/LM data overflow: putting security analysts back in control

SIEM/LM solutions have evolved continuously over the last 15 years to match the ever-changing landscape of cyber-security threats. SIEM/LM solutions aim to provide analysts with all the information and context they need to determine the nature of an attack, its degree of sophistication and how far it has spread inside the network. To contain the damage of a security breach and react efficiently, analysts need the right information at the right time.

Today, it remains a considerable challenge for organisations of all sizes to meet their operational, audit and security needs. As networks become more and more complex, the number of devices to monitor has significantly increased. Analysts are overwhelmed with data. Because of that, aggregating these different SIEM/LM data sources has become a challenge in itself. These limitations hamper analysts: they have too much data, but not enough information. There is a real need to reduce the scale and complexity of the analysis to a more intelligible level so that analysts can come up with appropriate solutions to improve overall security. Advanced data visualisation solutions enable just that.

But for the moment, SIEM/LM solutions still rarely include data visualisation tools. Even when they do, these tools are not efficient at handling such large amounts of data and do not offer real-time pattern detection and exploration. Right now, most companies relying on SIEM/LM data visualisation solutions only use them for illustration purposes rather than for analysis. They often have to rely on external services to carry out post-attack forensics, as these operations require a lot of skill and time.

Using graph data visualisation tackles this problem and makes SIEM/LM data operational again

Today, the trend in the cyber-security world to resolve these issues is to switch from the traditional data warehouse framework to more flexible and scalable data backends. This enables the use of new tools such as graph data visualisation and analytics solutions. Typically these new backends take the form of data lake frameworks: often Hadoop combined with other services such as graph databases and other analytics tools. Data lakes have many advantages over data warehouses when it comes to managing terabytes of security logs: centralisation, flexibility, operational readiness and high scalability. Companies that are serious about using new analytics applications such as Linkurious for their SIEM/LM data will have to make the switch sooner or later. Depending on the company’s needs, the switch can also be fairly non-intrusive for the existing system architecture.

How Linkurious empowers security analysts

Once the SIEM/LM data is centralised into the data lake, using a graph data visualisation solution like Linkurious to explore and investigate the data provides analysts with a real added value for their everyday operations. They are operational in real time, can visualise the data instantly and can carry out precise post-attack forensics analysis in much simpler ways than ever before. The detection of suspicious activity patterns can be largely automated using pattern recognition algorithms. That way, analysts can focus on investigating suspicious activity visually.

Visualisation is empowering for analysts as it resolves to a great extent the problem of having large amounts of data to interpret. Visualisation considerably reduces the scale and complexity of the analysis. It also allows companies to carry out most of their forensics analysis internally. With Linkurious’ advanced collaboration and security features, analysts are able to work together, share visualisations with each other, and administer user access rights to the data. Finally, the advanced customisation possibilities that Linkurious offers allow it to be integrated into internal security systems.

Next, we will demonstrate Linkurious’ possibilities using a real-life SIEM/LM dataset to see the advantages of graph visualisation technology to monitor networks in real-time and perform advanced forensics analysis.

Using Linkurious for cyber-security: a real-life use case

This dataset was created from a real-life log archive of an enterprise network, courtesy of the University of Victoria, which created and published the dataset for general research purposes. The dataset combines several existing publicly available malicious and non-malicious SIEM/LM log datasets and reproduces the day-to-day usage of an enterprise network. More information on the dataset here.

The PCAP files were generated with Wireshark and we converted them into a CSV file. We then generated several CSV files to model the dataset and import it into Neo4j.

Modelling

We used the following model for the Neo4j database:


Import Script:

cd C:\Users\linkurious\Downloads\neo4j-community-3.0.1-windows\neo4j-community-3.0.1\bin

neo4j-import --USING PERIODIC COMMIT 1000 --skip-bad-relationships --C:\Users\linkurious\Downloads\neo4j-community-3.0.0-RC1-windows\neo4j-community-3.0.0-RC1\bin --nodes nodeip.src.csv --nodes nodeport.csv --relationships Relationshipdst.portip.dst.csv --relationships RelationshipIP.srcdst.port.csv --into C:\

The connections were aggregated, keeping only their start date and end date, to reduce the number of edges. Creating an edge for each transmitted packet would create super nodes and make the graph very difficult to read. The model we use is very simple, but the modelling can be adapted to fit very specific use cases depending on what the analyst is looking for.
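As an illustration of this aggregation step, here is a minimal Python sketch that collapses packet-level rows into one record per (source IP, destination IP, destination port), keeping the first and last timestamps. It is not the script we used for the import, and it simplifies the model by keeping the port inside the edge key. The row layout, the numeric timestamps and the port numbers are assumptions.

```python
def aggregate_connections(packets):
    """Collapse packets into one record per (source, destination, port).

    `packets` is an iterable of (timestamp, src_ip, dst_ip, dst_port) tuples.
    Each record keeps the first and last timestamp seen and a packet count.
    """
    connections = {}
    for timestamp, src_ip, dst_ip, dst_port in packets:
        key = (src_ip, dst_ip, dst_port)
        record = connections.get(key)
        if record is None:
            connections[key] = {"start": timestamp, "end": timestamp, "packets": 1}
        else:
            record["start"] = min(record["start"], timestamp)
            record["end"] = max(record["end"], timestamp)
            record["packets"] += 1
    return connections


# Illustrative rows: timestamps (in seconds) and ports are made up.
PACKETS = [
    (10.0, "12.166.237.145", "172.16.0.11", 25),
    (12.5, "12.166.237.145", "172.16.0.11", 25),
    (11.0, "12.166.237.145", "172.16.0.12", 25),
]

CONNECTIONS = aggregate_connections(PACKETS)
print(len(PACKETS), "packets ->", len(CONNECTIONS), "edges")
```

The packet count is kept as an edge property so that the volume of traffic is not lost when the individual packets are dropped.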

Using Linkurious to identify an SMTP storm botnet

Linkurious enables analysts to visualise data that is otherwise difficult to conceptualise. Experienced analysts know what “normal” behaviour looks like on the network they manage. This makes it possible for them to set up pattern detection algorithms that pull abnormal behaviours from the database. For example, the following visualisation shows a “normal” interaction in the network: IPs interact with a wide variety of service ports on 131.243.125.208.


Normal activity on the network

On the other hand, here is an abnormal behaviour pattern. Most of the IPs that connect to “172.16.0.11” use port 25 (the SMTP port) and do not generate traffic on any other service. This is suspicious in itself. But the large number of IPs doing the same operation at the same time seems to indicate a botnet carrying out a UDP storm attack, which is essentially a denial-of-service (DoS) attack.

A UDP storm attack
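The abnormal pattern described above (sources whose only traffic goes to port 25 of a single host) can be sketched in a few lines of Python. This is an illustration of the logic, not the detection we ran in Linkurious. The source IPs come from ranges reserved for documentation and the function name is hypothetical.

```python
from collections import defaultdict


def smtp_only_sources(connections, target, port=25):
    """Return the sources whose entire traffic goes to target:port.

    `connections` is an iterable of (src_ip, dst_ip, dst_port) tuples.
    """
    seen_by_source = defaultdict(set)
    for src_ip, dst_ip, dst_port in connections:
        seen_by_source[src_ip].add((dst_ip, dst_port))
    return sorted(
        src_ip
        for src_ip, seen in seen_by_source.items()
        if seen == {(target, port)}
    )


# Illustrative traffic. The source IPs are documentation addresses, not real data.
TRAFFIC = [
    ("198.51.100.1", "172.16.0.11", 25),
    ("198.51.100.2", "172.16.0.11", 25),
    ("198.51.100.2", "172.16.0.11", 25),
    ("203.0.113.9", "172.16.0.11", 25),
    ("203.0.113.9", "172.16.0.11", 80),  # also uses another service
]

SUSPECTS = smtp_only_sources(TRAFFIC, "172.16.0.11")
print(SUSPECTS)
```

A single source behaving this way is not alarming. What matters is the size of the returned list, which the analyst can compare with what is normal for that mail server.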

If a geolocation service fetches the GPS coordinates of the IP addresses, it is possible to visualise them directly on a map. In one click, using Linkurious’ geospatial visualisation feature, we can see that most of the IPs that are part of the botnet are in the same region. Most of them come from around Odessa in Ukraine.

Geospatial representation of the IP addresses of the UDP botnet attack


Zoom in to the most concentrated activity region


Most of the toxic traffic comes from Ukraine around Odessa

 

We can then explore the activity of specific IP addresses and see which services were affected by their activity. For example, the address “12.166.237.145” has other links that we haven’t examined yet. Let’s examine it separately and expand it to see all its connections. That way, we see it links to another IP on our network: “172.16.0.12”.


Exploring 12.166.237.145 connections on the network

If we expand the IP address “172.16.0.12” to see its connections, we find it is connected to another attack. This means the two are probably linked and that the network may have been compromised several times. The attack follows the same pattern as the first SMTP storm attack we just saw.

A second botnet attack

Linkurious: graph data visualisation for cyber-security threats analysis

This simple use case shows the great potential graph visualisation technology has for cyber-security analysts. Analysts can now start to make sense of their connected data and investigate any suspicious behaviours on their network. Graph visualisation offers a high level of precision for analysts to quickly understand any kind of security event. Assessing the degree of sophistication of an attack and reacting accordingly becomes easier than ever before.

Once the company’s data framework is ready for graph data visualisation, Linkurious becomes a solid ally for security analysts. The possibilities that solutions like Linkurious offer enable analysts to overcome the overflow of SIEM/LM data and extract the information they need. Graph visualisation has the potential to reduce the complexity of their analysis, making SIEM/LM data operational. Forensic analysis also becomes less expensive as it is now possible to conduct it internally more often.

Graph technology enables the automation of a large part of the detection process. That way, analysts can focus on investigating the security alerts on the network. Linkurious’ collaboration features also enable them to work together more efficiently and rapidly. Linkurious meets security standards for such sensitive data and provides the necessary tools to administer user access rights. Using a graph-based approach also offers many advantages when working with non-technical users and other departments inside the company because of its inherent simplicity. Who doesn’t understand nodes and edges?

Want to explore and understand your graph data? Simply try the demo of Linkurious or contact us!

Investigating Enron’s email corpus: The trail of Tim Belden

In fraud and white-collar crime cases, forensic investigators often have to go through massive amounts of complex connected data to gather evidence. In recent years, the development of graph databases and data visualization tools has made it much easier to quickly find information that would have taken days to find by other means. Let’s see how Linkurious can help investigate a real-life email network dataset to establish responsibilities or evidence of guilt. We’ll use real emails from Enron, the company behind one of the biggest financial scandals in US history.

Investigating Enron’s emails

In October 2001, the U.S. Securities and Exchange Commission (SEC) began investigating what would rapidly become known worldwide as the Enron scandal. The energy company had been using accounting loopholes and offshore platforms to conceal billions of dollars of debt in its financial reports for years. It was also found to have manipulated the Californian and Canadian energy markets to push prices up artificially and increase its profit. The scandal eventually led to Enron’s bankruptcy, making it the biggest company reorganization in American history at the time. Many executives were indicted and tried.

During its investigation, the Federal Energy Regulatory Commission (FERC) made the controversial decision to publish online all of the company’s emails for transparency, historical and academic research purposes. The “Enron email corpus”, as it is now widely known, constitutes the largest public domain database of real world company e-mails in the world and has been used in a very large range of studies and research projects worldwide.

Importing the email corpus into Neo4j

To start exploring the corpus, we needed to import it into a Neo4j graph database. In order to do so, we relied heavily on Arne Hendrik Schulz’s work and his MySQL 4 dumps of the dataset, which we turned into CSV files. The result is a graph with 328,209 nodes and 2,317,231 relationships. You can learn more about how to import large datasets into Neo4j here.

Our graph model is pretty simple: we have two types of nodes, persons and emails. Persons are linked to emails by “HAS_RECEIVED” and “HAS_SENT” relationships. We could use Linkurious to explore the email contents themselves, but in this article we are more interested in exploring the network of key executives in the scandal to see if we can find information that could be useful to investigators.

Investigating Tim Belden’s network

Tim Belden, the head of trading at Enron, was one of the first executives to be prosecuted and to admit wrongdoing at Enron. He pleaded guilty to charges of conspiracy to commit wire fraud as part of a plea bargain and agreed to cooperate with the authorities to help convict many top Enron executives. He’ll be the starting point of our fictional investigation. Let’s see if we can find relevant information just by analysing his email activity.

The first problem we have to deal with here is that a lot of the emails he sent and received were addressed to many recipients. The ones that are really interesting to us as investigators are his personal emails. A quick and easy way to isolate them is to expand only the least connected nodes among his sent and received emails. That way, we find the interlocutors with whom he had direct one-to-one contact. This method works well when we do not need 100% precision to explore the data.
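A stricter variant of this filtering can be sketched in Python: keep only the emails with exactly one recipient and count the counterpart of each. This is an illustration under assumptions, not what Linkurious does internally. The email records below are invented, and people are identified by short hypothetical labels rather than real addresses.

```python
from collections import Counter

# Invented email records, for illustration only.
EMAILS = [
    {"sender": "belden", "recipients": ["lavoreto"]},
    {"sender": "belden", "recipients": ["lavoreto", "presto", "kitchen"]},
    {"sender": "lavoreto", "recipients": ["belden"]},
    {"sender": "presto", "recipients": ["belden", "dasovich"]},
    {"sender": "symes", "recipients": ["belden"]},
]


def direct_contacts(emails, person):
    """Count one-to-one exchanges between `person` and each counterpart.

    Only emails with exactly one recipient are considered.
    """
    counts = Counter()
    for email in emails:
        if len(email["recipients"]) != 1:
            continue  # skip emails sent to several recipients
        recipient = email["recipients"][0]
        if email["sender"] == person:
            counts[recipient] += 1
        elif recipient == person:
            counts[email["sender"]] += 1
    return counts


CONTACTS = direct_contacts(EMAILS, "belden")
print(CONTACTS.most_common())
```

Note that a distribution list appears as a single recipient in this kind of model, so such addresses still have to be excluded separately.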

A quick look at the graph shows that he used his email address primarily to send emails to Enron’s World Trade Center office: ‘center.dl-portland@enron.com’. But he did send a few emails to individuals inside the company as well.

Belden’s sent email activity

Now, if we get rid of all the emails sent to the WTC office and add the 200 least connected emails he received, we get a map of all his interactions inside Enron. After removing the uninteresting emails, we see that his primary interlocutors inside the company were John Lavoreto, Jeff Dasovich, Kevin M. Presto, Philip K. Ellen, Louise Kitchen and Kate Symes, all top executives at Enron. Dasovich was Enron’s governmental affairs executive, Presto was Vice President, Lavoreto and Kitchen were senior traders, and Ellen and Symes were both traders as well.

Belden’s cleaned email activity map

Assessing Belden’s relationships

Now let’s play the part of a forensic investigator who wants to assess Belden’s relationships inside the company. Lavorato appears to be by far the individual with whom he had the most interactions, even though he only sent a few emails to him. With such information, an investigator could have decided to investigate their relationship further. Doing so, they could have discovered a conversation between the two proving that they both knew Enron was actively manipulating the Canadian energy market in August 2000. The scam operation was called Project Stanley. As the FERC most probably lacked the tools to explore the dataset efficiently, this story only came out in 2005. If they had had a tool like Linkurious, they would have been able to spot significant relationships more easily and would have known which emails to drill into.

Shortest paths between Lavorato and Dasovich

Now, we can also investigate whether the people in Belden’s first circle knew each other. An easy and effective way to do this is to use the “find the shortest path” feature of Linkurious. For example, let’s check whether Lavorato and Dasovich interacted directly. Instantly, we see that they never exchanged any private emails but only received the same chain emails with many recipients.
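For readers who want to see what a shortest-path search does on this kind of person-and-email graph, here is a minimal breadth-first search in Python. It is a sketch under assumptions: we do not know how Linkurious or Neo4j implement the feature, the node names are placeholders, and relationships are treated as undirected.

```python
from collections import defaultdict, deque

# Hypothetical toy data: (person, relationship, mail_id) triples.
EDGES = [
    ("person-a", "HAS_SENT", "mail-1"),
    ("person-b", "HAS_RECEIVED", "mail-1"),
    ("person-b", "HAS_SENT", "mail-2"),
    ("person-c", "HAS_RECEIVED", "mail-2"),
]


def shortest_path(edges, start, goal):
    """Return one shortest path from start to goal as a list of nodes, or None."""
    neighbours = defaultdict(set)
    for person, _rel, mail in edges:
        neighbours[person].add(mail)
        neighbours[mail].add(person)

    previous = {start: None}  # also serves as the set of visited nodes
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbours[node]):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


print(shortest_path(EDGES, "person-a", "person-c"))
```

A path of the form person, email, person means the two people appear on the same email. Whether that email is private or a mass mailing still has to be judged from the number of people attached to it.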

On the other hand, Lavorato and Presto did have many private email interactions. It could be interesting to investigate their relationship as well, since they are both connected to Belden.

Lavoreto Kevin interactions Enron

Shortest paths between Lavorato and Presto

A quick search on Google tells us that the FERC established in 2002 that “Presto’s role paralleled that of Tim Belden” and that he was also involved in Project Stanley. Using the dataset, we can establish that Belden, Lavorato and Presto were part of the same circle inside Enron and communicated with one another.

Querying the dataset

Now let’s see how we can return nodes that fit more complex patterns and criteria in the dataset. Queries written in Cypher, Neo4j’s query language, can be entered directly in Linkurious. For example, this query returns the persons whose email address ends with “@enron.com”, who received emails and who never sent any. This could be a useful query if the investigators suspect that some emails were deleted from the dataset and wish to check which email addresses were affected.

// Cypher request:
MATCH (p:Person)-[:HAS_RECEIVED]->(:Mail)
WHERE p.email =~ '.*@enron\\.com'
  AND NOT (p)-[:HAS_SENT]->(:Mail)
RETURN DISTINCT p
LIMIT 10

Here we have three results, but they don’t seem to highlight any wrongdoing on Enron’s side:

Cypher query enron
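The logic of a “received but never sent” filter can also be checked outside the database. The following Python sketch applies it to a handful of made-up edges. The addresses and the function name `received_but_never_sent` are assumptions for illustration only, and it is not a replacement for running the query in Neo4j.

```python
import re

# Hypothetical toy data: (person, relationship, mail_id) triples.
EDGES = [
    ("tim.belden@enron.com", "HAS_SENT", "m1"),
    ("colleague.a@enron.com", "HAS_RECEIVED", "m1"),
    ("outside.contact@example.com", "HAS_RECEIVED", "m1"),
    ("colleague.a@enron.com", "HAS_SENT", "m2"),
    ("tim.belden@enron.com", "HAS_RECEIVED", "m2"),
    ("colleague.b@enron.com", "HAS_RECEIVED", "m2"),
]


def received_but_never_sent(edges, pattern=r".*@enron\.com"):
    """Addresses that match `pattern` entirely, received at least one email
    and never appear as a sender."""
    senders = {person for person, rel, _mail in edges if rel == "HAS_SENT"}
    receivers = {person for person, rel, _mail in edges if rel == "HAS_RECEIVED"}
    regex = re.compile(pattern)
    return sorted(p for p in receivers - senders if regex.fullmatch(p))


silent = received_but_never_sent(EDGES)
print(silent)
```

`re.fullmatch` is used because a regular expression in Cypher’s `=~` operator has to match the whole string, not just part of it.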

 

Another good example of a graph query would be to find all the personal emails connected to a person. The following query returns all the emails that have more than one but fewer than three connections (that is, exactly two, typically one sender and one recipient) and were sent or received by Tim Belden:

// Cypher request:
MATCH (n)--()
WITH n, count(*) AS rel_cnt
WHERE 1 < rel_cnt <= 2
MATCH (n:Mail)--(:Person {email: 'tim.belden@enron.com'})
RETURN n

Cypher query Enron Belden

The result is nearly the same as what we had earlier when we expanded Belden’s least connected emails, except that this time we’re sure not to have missed any node that fits the criteria we set. It is simply a more rigorous and precise way of obtaining a map of his interlocutors.

If anything, this exercise demonstrates the power of graph visualisation when investigating or auditing a network. Without even having read the emails, we managed to establish who belonged to Belden’s first circle inside Enron and that some people in his network knew each other as well. It turned out that Belden, Lavorato and Presto indeed knew about Project Stanley and were all potentially involved in it. Linkurious is well suited to investigating social networks in detail, finding key people and communities, and establishing responsibilities and relationships. It can be used to conduct large-scale audits or investigations inside large organisations of any kind.

Want to explore and understand your graph data? Simply try the demo of Linkurious!

Insurance Fraud Investigation using Linkurious

In the US alone, insurance fraud costs companies around 80 billion dollars each year. Being able to detect fraud schemes before the fraudsters have been able to access the funds they’re trying to steal is a major advantage for insurance companies and their customers. Nevertheless, the detection and investigation processes remain difficult for these institutions. Most of them simply lack the appropriate tools to detect complex fraud patterns that blend in with normal user behavior and to investigate complex fraud cases.

Linkurious enables deep and efficient visual investigation of suspicious patterns in your data. Here we’ll take a look at property damage claims and how Linkurious can help analysts investigate fraudulent-looking cases.

Using graphs to spot insurance fraud

Insurance fraud is rarely the work of an isolated individual. Fraudsters usually form complex networks that are difficult for insurance and financial institutions to detect. One of the most common techniques used by fraudsters is to forge fake identities, file several claims and cash the insurance checks. Creating fake identities requires forging or stealing personal information such as social security numbers (SSN), addresses, credit cards, etc. The fraudsters then submit these pieces of information to the insurance companies when they become customers. Forging new information for each fake identity comes at a high cost for fraudsters. This is why they often recycle this data to create several fake identities.

insurance fraud network example

A graph approach can help us spot suspicious connections

The picture above shows two customers and what they are connected to. Each customer has a unique personal address, phone number and email, but somehow they share the same SSN, which is normally unique to each individual.

A graph approach makes it possible to spot suspicious insurance fraud patterns in large datasets.
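The shared-SSN pattern described above amounts to grouping customers by the identifiers they are connected to and keeping the groups with more than one member. Here is a minimal Python sketch on made-up records. The field names, customer IDs and the function name `shared_identifiers` are assumptions, and a real deployment would run an equivalent pattern-matching query in the graph database instead.

```python
from collections import defaultdict

# Hypothetical customer records; every value is invented for illustration.
CUSTOMERS = {
    "customer-1": {"ssn": "000-00-0001", "phone": "555-0101",
                   "email": "c1@example.com", "address": "1 Example St"},
    "customer-2": {"ssn": "000-00-0001", "phone": "555-0102",
                   "email": "c2@example.com", "address": "2 Example Ave"},
    "customer-3": {"ssn": "000-00-0003", "phone": "555-0103",
                   "email": "c3@example.com", "address": "3 Example Rd"},
}


def shared_identifiers(customers, fields=("ssn",)):
    """Map each (field, value) pair used by more than one customer
    to the sorted list of customer IDs that share it."""
    owners = defaultdict(set)
    for customer_id, details in customers.items():
        for field in fields:
            value = details.get(field)
            if value:
                owners[(field, value)].add(customer_id)
    return {key: sorted(ids) for key, ids in owners.items() if len(ids) > 1}


suspicious = shared_identifiers(CUSTOMERS)
print(suspicious)
```

Passing more fields, for example `fields=("ssn", "phone", "email")`, widens the check to other identifiers that fraudsters may recycle.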

Limits in current fraud investigation tools

Typically, insurance companies rely on relational databases (RDBMS) to store their customer data. RDBMS were designed in the 1970s and 1980s to codify paper forms and tabular structures. They do this task very well and remain one of the best tools for storing and organizing data. Nevertheless, when it comes to querying and visualising large amounts of connected data, RDBMS do not perform well, as they were not designed for this purpose. In these cases, RDBMS often lack important features, are slow, have inflexible data models and are unable to query the data in real time. All of these issues make them difficult to use for fraud detection, where analysts need to identify and investigate suspicious connections.

Linkurious makes it possible to overcome these issues. Its easy-to-use interface simplifies the job of analysts and allows data scientists or developers to leverage the power of graph databases like Neo4j. Detecting suspicious patterns at the database level is greatly facilitated by the Cypher graph query language. Once these patterns are isolated, being able to visualise them instantly with Linkurious enables deep and efficient investigations.

Investigating suspicious-looking patterns in data with Linkurious is fast and intuitive for analysts. Analysts with no particular technical background can thus carry out complex investigations.

How Linkurious helps analysts investigate insurance fraud cases

Linkurious makes it possible to visualize and understand how different entities are connected. It gives analysts the capability to quickly distinguish between a real insurance fraud case and non-relevant alerts, saving them precious time.

Here is what a ‘normal’ insurance customer looks like in Linkurious. The customer is connected to a single claim, SSN, phone number, email address and address. We can see that he doesn’t share any of his personal details with other existing customers or known fraudsters. A lawyer and an evaluator are connected to the claim.

Normal customer linkurious

A normal looking customer in Linkurious.

At a glance, it’s possible to understand that 8 different entities are connected and to assess that the situation seems normal. An analyst can use this picture as a “template” that will make identifying fraud cases easier.

Now, let’s look at two other customers and their claims. The visualization is very different from the normal template we just saw. This situation should thus immediately attract the attention of the analysts.

The fact that the two customers share an address and the same last name means that we are probably looking at a couple. The situation seems normal. The investigator can choose to dismiss the case, open a low-priority investigation if doubt remains, or keep an eye on it to track interactions with the nodes and react quickly if something suspicious happens.

False Positive Linkurious

Visualising a false-positive that would have otherwise been flagged as suspicious

Linkurious helps save precious resources and avoid customer dissatisfaction.

Let’s look at three more customers and the property damage claims they are connected to. These claims are under the supervision of the same lawyer and the same evaluator. Curiously, the two customers who filed the three property damage claims, John Piggyback and Werner Stiedemann, are both linked to a third, existing customer, Paula Smith. Piggyback has the same phone number as her and Stiedemann the same email address. This is an abnormal situation, as neither of the two shares her name or address.

insurance fraud linkurious

Visualizing a potential insurance fraud case.

This situation is very likely to be insurance fraud. The fraud investigators can block the transactions for all three cases and launch further investigations and/or legal proceedings.
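The reasoning used in the last two examples (a shared address plus a shared last name looks like a household, while a shared phone number or email address without a shared name or address looks suspicious) can be written down as a simple triage rule. The Python sketch below is only an illustration of that rule: the contact details are invented, the function name `classify_pair` and its labels are hypothetical, and a real analyst would still review each case visually.

```python
# Fields compared between two customers (an assumption for this sketch).
FIELDS = ("ssn", "phone", "email", "address")

# The names come from the example above; every contact detail is invented.
PAULA = {"last_name": "Smith", "phone": "555-0100",
         "email": "paula@example.com", "address": "1 Example St"}
JOHN = {"last_name": "Piggyback", "phone": "555-0100",
        "email": "john@example.com", "address": "2 Example Ave"}
WERNER = {"last_name": "Stiedemann", "phone": "555-0199",
          "email": "paula@example.com", "address": "3 Example Rd"}
# A made-up couple, standing in for the false positive seen earlier.
DOE_1 = {"last_name": "Doe", "phone": "555-0111",
         "email": "doe1@example.com", "address": "9 Example Ln"}
DOE_2 = {"last_name": "Doe", "phone": "555-0112",
         "email": "doe2@example.com", "address": "9 Example Ln"}


def classify_pair(a, b):
    """Return a triage label and the sorted list of details two customers share."""
    shared = sorted(f for f in FIELDS if a.get(f) and a.get(f) == b.get(f))
    if not shared:
        return "unrelated", shared
    same_household = (a.get("last_name") == b.get("last_name")
                      and "address" in shared)
    # An SSN should never be shared, even inside a household.
    if same_household and "ssn" not in shared:
        return "probable household", shared
    return "suspicious", shared


for label, pair in [("John/Paula", (JOHN, PAULA)),
                    ("Werner/Paula", (WERNER, PAULA)),
                    ("Doe couple", (DOE_1, DOE_2))]:
    print(label, classify_pair(*pair))
```

Such a rule only ranks alerts. The decision to dismiss a case or to block a transaction remains with the investigator.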

Linkurious makes it possible to leverage the power of graph databases via a simple interface. Data scientists and developers can design queries to spot potential fraud cases using the pattern-matching capabilities of Neo4j. Analysts focus on the visual investigation of the suspicious cases.

The visualization capabilities of Linkurious mean fraud analysts can quickly evaluate whether the cases identified by the algorithms are false positives or serious cases. Reviewing these cases with Linkurious before blocking any transaction helps avoid treating genuine clients like potential criminals and harming their customer experience. Linkurious also makes it easier for investigators to dismantle entire fraud networks at once by tracing all their ramifications without missing a key element. Finally, using the collaboration tools to communicate with other analysts and with the authorities eases the whole investigation and prosecution process, making it one of the most complete graph database visualisation solutions out there.
