Label Propagation - Analytics & Algorithms

Change Password

Submit

Change Email

Submit

Change Nickname

Current Nickname:

Submit

Profile

Account ID:

Full Name:
Phone:
Company:
Company Email:
Country:
Language:

Change Password

Apply

You have no license application record.

Apply

Certificate	Issued at	Valid until	Serial No.	File

Serial No.	Valid until	File

Not having one? Apply now! >>>

Product	Created On	ID	Amount (USD)	Invoice

Product	Created On	ID	Amount (USD)	Invoice

No Invoice

Create Ultipa Account

Already have an Ultipa account? Sign in now!

Forgot Password

Reset Password

Back to sign in

Label Propagation

Overview

Label Propagation Algorithm (LPA) is a community detection algorithm based on label propagation; it is an iterative process of nodes aggregation with the goal of label distribution stabilization. Although the results of the algorithm have a certain degree of randomness, it is very suitable for large complex networks which owes to its low time complexity and no need to specify the number of communities in advance. It is often included in the benchmark testing report in both academia and industry.

Related material of the algorithm:

U.N. Raghavan, R. Albert, S. Kumara, Near linear time algorithm to detect community structures in large-scale networks (2007)

Basic Concept

Label

Label is the value of a specified property of node. Label can be propagated to the 1-step neighbors of the node, nodes with label can adopt label from its 1-step neighbors. When detecting the communities, label represents the community of the node, the propagation of label is the expansion of the community.

Consider the propagation of label a and b of the red and yellow nodes in the graph above:

Blue node is the common neighbor of the two nodes, it will decide which label to adopt based on the weight of label a and b, or to adopt both labels if it is allowed.
Red node has self-loop edge, so label a also propagates to itself.
Label b cannot propagate to the green node during one round of propagation; if the purple node (green node's only neighbor) adopts label b, the green node will adopt label b during the next round of propagation.

Ultipa's LPA supports full label propagation, that is, all initial labels are valid and able to propagate. In case there is a need to specify part of the labels as valid, please contact Ultipa team for algorithm customization.

Label Probability

If a node can only adopt only one label from its neighbors, it is called single-label propagation, while if multiple labels are allowed to adopt, it is called multi-label propagation.

In multi-label propagation, after a node adopts multiple labels, each label will be given a probability which is proportional to the label weight, and the sum of all label probabilities of a node equals to 1. For single-label propagation, the case is much easier as the node's only label takes a probability of 1.

For example, if a node only adopts one label, the label probability is 1; if it adopts labels a and b with weights 1.5 and 1 respectively, the probabilities of a and b are 1.5/(1+1.5) = 0.6 and 1/(1+1.5) = 0.4 respectively.

Label Weight

Node j has label l, when it propagates to its neighbor node i, the weight of label l equals to the product of the probability of l, the weight of j, and the total weight of all edges between i and j; when i has multiple neighbor nodes with label l, the weight of each l needs to be summed:

where j is any neighbor node of i that has label l , p(l) is the probability of l, w(j) is the weight of j, A(ij) is the total weight of all edges between i and j.

It is needed to specify a numeric property of node as node weight, and a numeric property of edge as edge weight.

Label Adoption

The principle of label adoption is to keep the top one or several labels with the largest weights among the neighbor labels, and to calculate the label probabilities. When some labels have the same probabilities which affects the adoption decision, the system would select randomly from these parallel labels.

For any node i in the graph, l is the neighbor label of i, w(l) is the weight of l, the new label l of i can be denoted as:

As mentioned previously, under the circumstances of multi-label propagation, the algorithm normalizes the label weight, that is, converts label weight into a weight probability.

When the algorithm begins, initial labels are assigned to nodes based on the algorithm parameters; during each iteration, it calculates for each node whether there is any label different from its current labels can be adopted from its neighbors, and after all nodes are calculated, updates the nodes which can adopt new labels. Iterating and looping by this rule until no node could adopt new labels, or the number of iterations reaches the limit.

Ultipa's LPA follows the synchronous update principle when updating node labels, the algorithm has a corresponding interrupt mechanism against the label oscillations (mostly found in bipartite graph) that might occur.

Special Case

Lonely Node, Disconnected Graph

For disconnected graph, there is no edge between its lonely nodes and connected components to propagate labels, connected components do not contain the same initial labels will not be divided into the same community.

Self-loop Edge

LPA treats the self-loop edge in the same way with the Degree algorithm algo(degree), which counts each self-loop edge twice.

Directed Edge

For directed edges, LPA ignores the direction of edges but calculates them as undirected edges.

Results and Statistics

Run LPA in the graph below, all node and edge weights are 1, the letter wrote in the node is the initial label of node, node 16 has no label, the algorithm iterates 5 rounds:

Algorithm results 1: Keep maximum one label for each node, return _uuid, label_1 and probability_1

_uuid	label_1	probability_1
1	C	1.000000
2	C	1.000000
3	C	1.000000
4	C	1.000000
5	C	1.000000
6	H	1.000000
7	H	1.000000
8	H	1.000000
9	H	1.000000
10	H	1.000000
11	N	1.000000
12	N	1.000000
13	N	1.000000
14	N	1.000000
15	O	1.000000
16

Algorithm statistics 1: Number of labels label_count

label_count
4

Algorithm results 2: Keep maximum two labels for each node, return _uuid, label_1, probability_1, label_2 and probability_2

_uuid	label_1	probability_1	label_2	probability_2
1	C	0.705578	A	0.294422
2	C	0.658854	A	0.341145
3	A	0.508263	C	0.491737
4	C	0.688864	E	0.311136
5	A	0.572178	C	0.427822
6	H	0.723903	A	0.276097
7	H	0.797693	I	0.202307
8	H	0.645654	I	0.354346
9	H	0.779780	A	0.220220
10	H	0.617815	I	0.382185
11	N	0.611782	M	0.388218
12	N	0.611782	M	0.388218
13	N	0.766861	M	0.233139
14	N	0.663694	M	0.336306
15	O	1.000000
16

Algorithm statistics 2: Number of labels label_count

label_count
8

Command and Configuration

Command: algo(lpa)
Configurations for the parameter params():

Name	Type	Default	Specification	Description
node_label_property	`@<schema>.<property>`	/	Numeric or string class node property, LTE needed	Name of the node schema and property as label; node without this property will not participate in the calculation; random number is used as label if not set
node_weight_property	`@<schema>.<property>`	/	Numeric node property, LTE needed	Name of the node schema and property as node weight; node without this property will not participate in the calculation; weight is 1 if not set
edge_weight_property	`@<schema>.<property>`	/	Numeric edge property, LTE needed	Name of the edge schema and property as edge weight; edge without this property will not participate in the calculation; weight is 1 if not set
k	int	1	>=1	The maximum number of labels each node can keep; labels are ordered by probability
loop_num	int	5	>=1	Number of iterations

Example: Run LPA, label locates on node property @default.name, iterate 5 rounds and keep no more than two labels for each node

algo(lpa).params({
  node_label_property: @default.name,
  k: 2,
  loop_num: 5
  }) as a,b return a,b

Algorithm Execution

Task Writeback

1. File Writeback

Configuration	Data in Each Row
filename	`_id`,`label_1`,`probability_1`,...`label_k`,`probability_k`

Example: Run LPA, label locates on node property @default.name, iterate 5 rounds and keep no more than two labels for each node, write the algorithm results back to file named lpa

algo(lpa).params({
  node_label_property: @default.name,
  k: 2,
  loop_num: 5
}).write({
  file:{
    filename: "lpa"
  }
})

2. Property Writeback

Configuration	Writeback Content	Type	Data Type
property	`label_1`, `probability_1`, ... `label_k`, `probability_k`	Node property	Data type of label is `string`, data type of label probability is `float`

Example: Run LPA, label locates on node property @default.name, iterate 5 rounds and keep no more than two labels for each node, write the algorithm results back to node properties named newName_1, probability_1, newName_2 and probability_2

algo(lpa).params({
  node_label_property: @default.name,
  k: 2,
  loop_num: 5
}).write({
  db:{
    property: "newName"
  }
})

3. Statistics Writeback

Name	Data Type	Description
`label_count`	int	Number of labels

The number of labels means the number of labels left after the algorithm is finished; when k = 1, since each node only keeps one label, the number of labels is the number of communities.

Example: Run LPA, label locates on node property @default.name, iterate 5 rounds and keep no more than two labels for each node, write the algorithm statistics back to task information

algo(lpa).params({
  node_label_property: @default.name,
  k: 2,
  loop_num: 5
}).write()

Direct Return

Alias Ordinal	Type	Description	Column Name
0	[]perNode	Node and its label, label probability	`_uuid`, `label_1`, `probability_1`, ... `label_k`, `probability_k`
1	KV	Number of labels	`label_count`

Example: Run LPA, label locates on node property @default.name, iterate 5 rounds and keep no more than one label for each node, define algorithm results and statistics as aliases named results and count, and return the results and statistics

algo(lpa).params({
  node_label_property: @default.name,
  k: 1
}) as results, count 
return results, count

Streaming Return

Alias Ordinal	Type	Description	Column Name
0	[]perNode	Node and its label, label probability	`_uuid`, `label_1`, `probability_1`, ... `label_k`, `probability_k`

Example: Run LPA, label locates on node property @default.name, iterate 5 rounds and keep no more than one label for each node, and return the results ordered by the ascending label name and node UUID

algo(lpa).params({
  node_label_property: @default.name,
  k: 1,
  loop_num: 5
}).stream() as lpa 
return lpa order by lpa.label_1, lpa._uuid

Real-time Statistics

Alias Ordinal	Type	Description	Column Name
0	KV	Number of labels	`label_count`

Example: Run LPA, label locates on node property @default.name, iterate 5 rounds and keep no more than one label for each node, and return the number of communities

algo(lpa).params({
  node_label_property: @default.name,
  k: 1,
  loop_num: 5
}).stats() as count 
return count

Consistency of Results

Affected by factors such as the order of nodes, the random selection of labels with the same weights, the parallel calculation and so on, the community division results of LPA may be different.

ID
Product
Status
Cores
Applied Validity Period(days)
Effective Date
Excpired Date
Mac Address
Apply Comment
Review Comment