An article by Dr. Christian Michael Schneider, former DYMATRIX employee
Introduction to Network Theory
This blog series introduces the concept of network theory and shows how it can help meet business needs.
One of the main tasks of advanced analytics in CRM is understanding customer behavior. Traditionally, customer decisions are predicted from three kinds of information: product quality, socio-demographic data, and historical transaction data. For example, if a customer is planning a vacation trip, his vacation history, his income, and the quality of the available hotels and deals are used to predict the customer’s choice. However, people also base their decisions on their social environment, in this example his friends’ preferred destinations. This kind of information can be mined and utilized with the concepts of network theory.
Since not all readers are familiar with network theory, the blog series starts with a short introduction to network theory before showing business applications in later posts. If you are already an expert in the field, you can skip the rest of this post; otherwise, enjoy the beauty of the concept.
It all starts with the definition of a network. Formally, a network consists of individual entities, called nodes (also vertices), which are connected to each other through so-called edges (also links). The number of edges attached to a node is called its degree. In real networks, the degree distribution is key to how the network functions. Well-known examples of networks are infrastructures such as power grids, which consist of power plants (the nodes) and transmission lines (the edges), as in the European power grid shown in the accompanying figure.
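To make degree and degree distribution concrete, here is a minimal Python sketch. The four-node edge list and the helper names `degrees` and `degree_distribution` are made up for illustration and are not part of the original post.

```python
from collections import Counter

# Hypothetical toy network: an undirected edge list over four nodes.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]


def degrees(edge_list):
    """Return a dict mapping each node to its number of edges (its degree)."""
    deg = Counter()
    for u, v in edge_list:
        # An undirected edge contributes one to the degree of both endpoints.
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def degree_distribution(edge_list):
    """Return a dict mapping each degree k to the number of nodes with degree k."""
    return dict(Counter(degrees(edge_list).values()))


node_degrees = degrees(edges)
distribution = degree_distribution(edges)
print(node_degrees)
print(distribution)
```

In this toy network, node A is the best-connected node with three edges, while the distribution shows that most nodes have fewer.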
In this case, connectivity is key to functionality, since the power frequency should be constant (e.g. 50 Hz in Europe). The more power plants can be synchronized, the more stable the frequency. On the other hand, synchronizing power plants also has drawbacks. For example, the breakdown of a single transmission line can lead to huge blackouts, not only locally at the malfunctioning line but across the entire power grid. One famous example is the breakdown of a transmission line from southern Switzerland, which led to a blackout across all of Italy.
From the network point of view, all of these effects can be modelled. Connectivity corresponds to the existence of a path along edges between each pair of nodes, and it is captured by the largest connected component (the largest set of nodes that are all connected to one another by paths, together with the edges between them). The malfunctioning of transmission lines can be modelled as the random or intentional removal of edges from the network, and blackouts are associated with the collapse of the largest connected component.
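The modelling idea above can be sketched in a few lines of Python. This is a minimal illustration, not a power-grid model: the six-node toy network and the function name `largest_component_size` are assumptions introduced here, and breadth-first search is used as one common way to find connected components.

```python
from collections import defaultdict, deque


def largest_component_size(nodes, edges):
    """Return the number of nodes in the largest connected component."""
    # Build an undirected adjacency structure from the edge list.
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    best = 0
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


# Hypothetical toy grid: two triangles joined by a single line between 3 and 4.
nodes = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]

size_intact = largest_component_size(nodes, edges)

# Model the failure of one transmission line as the removal of one edge.
edges_after_failure = [edge for edge in edges if edge != (3, 4)]
size_after_failure = largest_component_size(nodes, edges_after_failure)

print(size_intact, size_after_failure)
```

In this toy network, removing the single line between nodes 3 and 4 shrinks the largest connected component from six nodes to three, which is the kind of collapse the text associates with a blackout.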
Another everyday example is social networks, such as friendship or business networks. Friendship networks consist of people (nodes) and social interactions (edges). These networks are not visible, yet they are very important for a wide range of spreading effects, such as the spreading of information and disease. While information spreading matters more for business applications, disease spreading is easier to picture, so the focus here is on the latter.
Diseases usually break out locally, e.g. SARS in Asia. However, SARS spread from people in Asia to people in other countries, because people travel and meet others (e.g. for business purposes). When one person in a meeting is sick, the disease can spread to the others in the meeting. This effect can be modelled with dynamic processes on networks. In this case, people are the nodes and social connections are the edges. Each node is assigned a health state (susceptible, infected, recovered), and each edge is assigned a transmission probability for passing the disease on to a friend (someone connected by an edge). With this setup, one can model how the disease spreads over time, and one can also start to test immunization strategies.
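A disease-spreading process of this kind can be sketched as a simple discrete-time simulation. The sketch below assumes a susceptible/infected/recovered scheme in which every edge has the same transmission probability per time step, infected nodes recover with a fixed probability per step, and immunized nodes are simply treated as recovered from the start. The function name `simulate_sir`, its parameters, and the four-person chain are hypothetical choices for illustration.

```python
import random


def simulate_sir(adjacency, seeds, transmission_prob, recovery_prob, rng,
                 immunized=(), max_steps=1000):
    """Run a discrete-time S/I/R process and return the list of state snapshots."""
    state = {node: "S" for node in adjacency}
    # Assumption: immunized nodes behave like recovered ones from the start.
    for node in immunized:
        state[node] = "R"
    for node in seeds:
        if state[node] == "S":
            state[node] = "I"

    history = [dict(state)]
    for _ in range(max_steps):
        infected = [node for node, s in state.items() if s == "I"]
        if not infected:
            break  # The outbreak has died out.
        new_state = dict(state)
        for node in infected:
            # Each infected node may pass the disease to susceptible neighbours.
            for neighbour in adjacency[node]:
                if state[neighbour] == "S" and rng.random() < transmission_prob:
                    new_state[neighbour] = "I"
            # Each infected node may recover at the end of the step.
            if rng.random() < recovery_prob:
                new_state[node] = "R"
        state = new_state
        history.append(dict(state))
    return history


# Hypothetical friendship chain: 0 - 1 - 2 - 3.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
rng = random.Random(42)

# Without immunization, certain transmission reaches everyone in the chain.
final_open = simulate_sir(chain, [0], 1.0, 1.0, rng)[-1]

# Immunizing person 2 cuts the chain and protects person 3.
final_immunized = simulate_sir(chain, [0], 1.0, 1.0, rng, immunized=[2])[-1]

print(final_open)
print(final_immunized)
```

With probabilities below 1.0 the outcome varies from run to run, so in practice one would average over many runs before comparing immunization strategies.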
Now that you have had a taste of the predictive and modelling power of network theory, the next blog post will be about identifying relevant networks for business needs.
Mitigation of malicious attacks on networks, CM Schneider et al., Proceedings of the National Academy of Sciences 108 (10), 2011, http://www.pnas.org/content/108/10/3838.full
Suppressing epidemics with a limited amount of immunization units, CM Schneider et al., Physical Review E 84 (6), 2011