Overview
The HANP (Hop Attenuation & Node Preference) algorithm extends the traditional Label Propagation algorithm (LPA) by adding a label score attenuation mechanism and by weighting each neighbor's label according to that neighbor's degree. HANP aims to improve the accuracy and robustness of community detection in networks. It was proposed in 2009:
- I.X.Y. Leung, P. Hui, P. Liò, J. Crowcroft, Towards real-time community detection in large networks (2009)
Concepts
Hop Attenuation
HANP associates each label with a score that decreases as the label propagates away from its origin. All labels start with a score of 1. Each time a node adopts a new label from its neighborhood, that label is assigned a new, attenuated score obtained by subtracting the hop attenuation δ (0 < δ < 1).
The hop attenuation mechanism limits the propagation of labels to nearby nodes and prevents them from spreading too broadly across the network.
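The attenuation step can be sketched in a few lines of Python. This is a minimal illustration, not the engine's implementation: the function name `attenuated_score` and its arguments are hypothetical, and the sketch assumes (following Leung et al.) that the base score is the highest score of the adopted label among the neighbors that hold it.

```python
def attenuated_score(neighbor_scores, delta, same_label=False):
    """Score a node assigns to the label it has just selected.

    neighbor_scores: scores of the selected label among the neighbors holding it
                     (assumption: the highest one is used as the base score).
    delta: hop attenuation, with 0 < delta < 1.
    same_label: True when the selected label equals the node's current label;
                in that case no attenuation is applied (delta is treated as 0).
    """
    effective_delta = 0.0 if same_label else delta
    return max(neighbor_scores) - effective_delta


# A label whose best neighboring score is 0.8 loses delta when it is newly adopted.
print(round(attenuated_score([0.8, 0.5], delta=0.2), 2))      # 0.6
# Re-selecting the current label keeps the best neighboring score unchanged.
print(attenuated_score([0.8, 0.5], delta=0.2, same_label=True))  # 0.8
```

Because every hop subtracts δ, a label's score shrinks with distance from its origin, which is what keeps labels from spreading across the whole network.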
Node Preference
When computing the new maximal label, HANP incorporates node preference based on node degree. When node j ∈ N_i propagates its label L to node i, the weight of label L is calculated by:
$$ s_j(L) \cdot \mathrm{deg}_j^{\,m} \cdot w_{ij} $$
where,
- s_j(L) is the score of label L in j.
- deg_j is the degree of j. When m > 0, more preference is given to nodes with high degree; when m < 0, more preference is given to nodes with low degree; when m = 0, no node preference is applied.
- w_ij is the sum of edge weights between i and j.
With the edge weights and label scores shown in the example below, and with m = 2 and δ = 0.2, the label of the blue node is updated from d to a, and the score of label a in the blue node is attenuated to 0.6.
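The effect of m on label selection can be illustrated with a short Python sketch. It assumes the weight contributed by neighbor j is the product of its label score, its degree raised to the power m, and the summed edge weight, and that contributions for the same label are added together. The function names and the neighbor values are hypothetical and are not taken from the figure.

```python
from collections import defaultdict


def label_weights(neighbors, m):
    """Total weight per label: sum of score * degree**m * edge_weight.

    neighbors: iterable of (label, score, degree, edge_weight), one per neighbor.
    Degrees are assumed to be >= 1, so negative m is safe.
    """
    totals = defaultdict(float)
    for label, score, degree, edge_weight in neighbors:
        totals[label] += score * (degree ** m) * edge_weight
    return dict(totals)


def pick_label(neighbors, m):
    """Return the label with the largest total weight.

    max() keeps the first label reached on ties; HANP itself is described
    as choosing randomly among labels with equal weights.
    """
    weights = label_weights(neighbors, m)
    return max(weights, key=weights.get)


# Hypothetical neighborhood: (label, score, degree, edge_weight)
neighbors = [("a", 0.8, 3, 2.0), ("b", 1.0, 1, 3.0), ("a", 0.6, 2, 1.0)]

print(pick_label(neighbors, m=2))  # a  (high-degree neighbors dominate)
print(pick_label(neighbors, m=0))  # b  (no node preference)
```

With m = 2 the two higher-degree neighbors carrying label a outweigh the single degree-1 neighbor, while with m = 0 the heavier edge to the b neighbor wins.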
Considerations
- HANP ignores edge direction and treats all edges as undirected.
- A node with self-loops propagates its current label(s) to itself, and each self-loop is counted twice.
- When the selected label equals the node's current label, δ is set to 0, so no attenuation is applied.
- HANP follows the synchronous update principle when updating node labels. This means that all nodes update their labels simultaneously based on the labels of their neighbors. The label score mechanism can prevent label oscillations.
- Due to factors such as the order of nodes, the random selection of labels with equal weights, and parallel calculations, the community division results of HANP may vary.
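The synchronous update described above can be sketched as a single round in Python. This is an illustrative sketch under stated assumptions, not the engine's implementation: `synchronous_round` and its data layout are hypothetical, the graph is assumed to be simple (degree equals the number of neighbors), ties are broken alphabetically rather than randomly, and the new score is assumed to be the best neighboring score of the chosen label minus δ.

```python
def synchronous_round(adj, labels, scores, m, delta):
    """One round in which every node reads only the previous round's state.

    adj: {node: {neighbor: summed_edge_weight}} for an undirected graph.
    labels, scores: {node: label} and {node: score} from the previous round.
    Returns new (labels, scores) dicts; the inputs are not modified.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    new_labels, new_scores = {}, {}
    for v, nbrs in adj.items():
        weights, best_score = {}, {}
        for u, w in nbrs.items():
            lab = labels[u]
            weights[lab] = weights.get(lab, 0.0) + scores[u] * degree[u] ** m * w
            best_score[lab] = max(best_score.get(lab, float("-inf")), scores[u])
        if not weights:  # isolated node: keep its label and score
            new_labels[v], new_scores[v] = labels[v], scores[v]
            continue
        # Deterministic tie-break (alphabetical) so the sketch is reproducible.
        chosen = max(sorted(weights), key=weights.get)
        effective_delta = 0.0 if chosen == labels[v] else delta
        new_labels[v] = chosen
        new_scores[v] = best_score[chosen] - effective_delta
    return new_labels, new_scores


# Hypothetical triangle: y and z share label B and a heavier edge.
adj = {"x": {"y": 1.0, "z": 1.0}, "y": {"x": 1.0, "z": 2.0}, "z": {"x": 1.0, "y": 2.0}}
labels = {"x": "A", "y": "B", "z": "B"}
scores = {"x": 1.0, "y": 1.0, "z": 1.0}

new_labels, new_scores = synchronous_round(adj, labels, scores, m=0, delta=0.2)
print(new_labels)  # {'x': 'B', 'y': 'B', 'z': 'B'}
```

Node x adopts B with an attenuated score, while y and z keep B without attenuation because the selected label matches their current one. All three decisions are computed from the same snapshot, so x's change does not influence y or z within the round.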
Example Graph
To create this graph:
// Run each row separately, in order, in an empty graphset
create().node_schema("user").edge_schema("connect")
create().node_property(@user,"interest",string).edge_property(@connect,"strength",int32)
insert().into(@user).nodes([{_id:"A",interest:"flute"}, {_id:"B",interest:"football"}, {_id:"C",interest:"piano"}, {_id:"D",interest:"violin"}, {_id:"E",interest:"piano"}, {_id:"F",interest:"movie"}, {_id:"G",interest:"piano"}, {_id:"H",interest:"tennis"}, {_id:"I",interest:"violin"}, {_id:"J",interest:"badminton"}, {_id:"K",interest:"swimming"}, {_id:"L",interest:"cello"}, {_id:"M",interest:"saxophone"}, {_id:"N",interest:"novel"}, {_id:"O",interest:"swimming"}])
insert().into(@connect).edges([{_from:"A",_to:"B",strength:3}, {_from:"A",_to:"C",strength:5}, {_from:"A",_to:"F",strength:8}, {_from:"A",_to:"K",strength:6}, {_from:"B",_to:"C",strength:2}, {_from:"C",_to:"D",strength:9}, {_from:"D",_to:"A",strength:5}, {_from:"D",_to:"E",strength:6}, {_from:"E",_to:"A",strength:5}, {_from:"F",_to:"G",strength:9}, {_from:"F",_to:"J",strength:4}, {_from:"G",_to:"H",strength:10}, {_from:"H",_to:"F",strength:3}, {_from:"I",_to:"H",strength:4}, {_from:"I",_to:"F",strength:2}, {_from:"J",_to:"I",strength:1}, {_from:"K",_to:"F",strength:1}, {_from:"K",_to:"N",strength:10}, {_from:"L",_to:"M",strength:1}, {_from:"L",_to:"N",strength:4}, {_from:"M",_to:"N",strength:8}, {_from:"M",_to:"K",strength:10}, {_from:"N",_to:"M",strength:4}, {_from:"O",_to:"N",strength:1}])
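To see how this edge list looks once direction is ignored, the following Python sketch rebuilds it as an undirected weighted adjacency map. It is an offline illustration only: `build_undirected` is a hypothetical helper, and counting degree as the number of incident edges is an assumption of the sketch.

```python
from collections import defaultdict

# (from, to, strength) triples copied from the insert statement above
EDGES = [
    ("A", "B", 3), ("A", "C", 5), ("A", "F", 8), ("A", "K", 6),
    ("B", "C", 2), ("C", "D", 9), ("D", "A", 5), ("D", "E", 6),
    ("E", "A", 5), ("F", "G", 9), ("F", "J", 4), ("G", "H", 10),
    ("H", "F", 3), ("I", "H", 4), ("I", "F", 2), ("J", "I", 1),
    ("K", "F", 1), ("K", "N", 10), ("L", "M", 1), ("L", "N", 4),
    ("M", "N", 8), ("M", "K", 10), ("N", "M", 4), ("O", "N", 1),
]


def build_undirected(edges):
    """Return (adj, degree) with direction ignored.

    adj[i][j] is the sum of the weights of all edges between i and j.
    degree[i] counts incident edges (assumption), so a self-loop counts twice.
    """
    adj = defaultdict(lambda: defaultdict(int))
    degree = defaultdict(int)
    for src, dst, weight in edges:
        adj[src][dst] += weight
        adj[dst][src] += weight
        degree[src] += 1
        degree[dst] += 1
    return adj, degree


adj, degree = build_undirected(EDGES)
print(adj["M"]["N"])  # 12: the M->N (8) and N->M (4) edges are summed
print(degree["N"])    # 5
```

The pair M and N is connected by two edges, so the summed weight between them is 12 when `strength` is used as the edge weight.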
Creating HDC Graph
To load the entire graph to the HDC server hdc-server-1 as hdc_hanp:
CALL hdc.graph.create("hdc-server-1", "hdc_hanp", {
  nodes: {"*": ["*"]},
  edges: {"*": ["*"]},
  direction: "undirected",
  load_id: true,
  update: "static",
  query: "query",
  default: false
})

hdc.graph.create("hdc_hanp", {
  nodes: {"*": ["*"]},
  edges: {"*": ["*"]},
  direction: "undirected",
  load_id: true,
  update: "static",
  query: "query",
  default: false
}).to("hdc-server-1")
Parameters
Algorithm name: hanp
Name | Type | Spec | Default | Optional | Description
---|---|---|---|---|---
node_label_property | "<@schema.?><property>" | / | / | Yes | Numeric or string node property used to initialize node labels; nodes without the specified property are ignored. If unset, the system generates the labels.
edge_weight_property | "<@schema.?><property>" | / | / | Yes | Numeric edge property used as the edge weights.
m | Float | / | 0 | Yes | The power exponent of the neighbor node degree: m > 0 gives more preference to high-degree nodes; m < 0 gives more preference to low-degree nodes; m = 0 applies no node preference.
delta | Float | [0, 1] | 0 | Yes | Hop attenuation δ.
loop_num | Integer | ≥1 | 5 | Yes | Number of propagation iterations.
return_id_uuid | String | uuid, id, both | uuid | Yes | Includes _uuid, _id, or both to represent nodes in the results.
limit | Integer | ≥-1 | -1 | Yes | Limits the number of results returned; -1 includes all results.
File Writeback
CALL algo.hanp.write("hdc_hanp", {
  params: {
    return_id_uuid: "id",
    loop_num: 10,
    edge_weight_property: "strength",
    m: 2,
    delta: 0.2
  },
  return_params: {
    file: {
      filename: "hanp.txt"
    }
  }
})

algo(hanp).params({
  project: "hdc_hanp",
  return_id_uuid: "id",
  loop_num: 10,
  edge_weight_property: "strength",
  m: 2,
  delta: 0.2
}).write({
  file: {
    filename: "hanp.txt"
  }
})
DB Writeback
Writes each label_1 and its score_1 from the results to the specified node properties. The property types are string and float, respectively.
CALL algo.hanp.write("hdc_hanp", {
  params: {
    node_label_property: "@user.interest",
    m: 0.1,
    delta: 0.3
  },
  return_params: {
    db: {
      property: "lab"
    }
  }
})

algo(hanp).params({
  project: "hdc_hanp",
  node_label_property: "@user.interest",
  m: 0.1,
  delta: 0.3
}).write({
  db: {
    property: "lab"
  }
})
Each node's label and its score are written to the new properties lab_1 and score_1.
Stats Writeback
CALL algo.hanp.write("hdc_hanp", {
  params: {
    node_label_property: "@user.interest",
    m: 0.1,
    delta: 0.3
  },
  return_params: {
    stats: {}
  }
})

algo(hanp).params({
  project: "hdc_hanp",
  node_label_property: "@user.interest",
  m: 0.1,
  delta: 0.3
}).write({
  stats: {}
})
Full Return
CALL algo.hanp("hdc_hanp", {
  params: {
    return_id_uuid: "id",
    loop_num: 12,
    node_label_property: "@user.interest",
    m: 1,
    delta: 0.2
  },
  return_params: {}
}) YIELD r
RETURN r

exec{
  algo(hanp).params({
    return_id_uuid: "id",
    loop_num: 12,
    node_label_property: "@user.interest",
    m: 1,
    delta: 0.2
  }) as r
  return r
} on hdc_hanp
Stream Return
CALL algo.hanp("hdc_hanp", {
  params: {
    loop_num: 12,
    node_label_property: "@user.interest",
    m: 1,
    delta: 0.2
  },
  return_params: {
    stream: {}
  }
}) YIELD r
RETURN r.label_1 AS label, count(r) GROUP BY label

exec{
  algo(hanp).params({
    loop_num: 12,
    node_label_property: "@user.interest",
    m: 1,
    delta: 0.2
  }).stream() as r
  group by r.label_1 as label
  return table(label, count(r))
} on hdc_hanp
Stats Return
CALL algo.hanp("hdc_hanp", {
  params: {
    loop_num: 5,
    node_label_property: "interest",
    m: 0.6,
    delta: 0.2
  },
  return_params: {
    stats: {}
  }
}) YIELD s
RETURN s

exec{
  algo(hanp).params({
    loop_num: 5,
    node_label_property: "interest",
    m: 0.6,
    delta: 0.2
  }).stats() as s
  return s
} on hdc_hanp