Overview
The p-Cohesion algorithm identifies groups of network players (nodes) that are highly connected with each other, represented by cohesive subgraphs. It provides valuable insights into the level of connectivity and interdependence within these groups, enabling in-depth analysis of the graph structure and its implications.
The concept of p-cohesion was first proposed by S. Morris in a contagion model of the interaction among large populations:
- S. Morris, Contagion. The Review of Economic Studies, 67(1), 57–78 (2000)
Concepts
p-Cohesion
One natural measure of the 'cohesion' of a group is the relative frequency of ties among group members compared to ties with non-members. Given a constant p ∈ (0,1), a p-cohesion is a connected subgraph in which every node has at least a proportion p of its neighbors inside the subgraph, i.e., at most a proportion (1 − p) of its neighbors outside it.
The p-Cohesion model offers two distinct advantages compared to other cohesive subgraph models:
- With a large p value, a p-cohesion ensures not only inner-cohesiveness, but also outer-sparseness.
- In many scenarios, considering the percentage of neighbors rather than a fixed number of neighbors (such as the k value in k-Core) is more appropriate due to variations in node degrees.
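The p-cohesion definition can be checked mechanically for any candidate node set. The following is a minimal sketch in Python, not Ultipa's implementation; the helper names (`build_adjacency`, `is_connected`, `is_p_cohesion`) and the small toy graph (a triangle x, y, z with a pendant node w attached to z) are hypothetical and chosen only for illustration.

```python
from collections import deque

def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def is_connected(adj, nodes):
    """Return True if the subgraph induced by `nodes` is connected."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, set()) & nodes:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == nodes

def is_p_cohesion(adj, nodes, p):
    """Check the definition: connected, and every member keeps at least
    a proportion p of its neighbors inside the node set."""
    nodes = set(nodes)
    if not is_connected(adj, nodes):
        return False
    for u in nodes:
        neighbors = adj.get(u, set())
        inside = len(neighbors & nodes)
        # Small tolerance guards against floating-point noise in p * degree.
        if inside < p * len(neighbors) - 1e-9:
            return False
    return True

# Hypothetical toy graph: triangle x-y-z plus pendant node w attached to z.
toy = build_adjacency([("x", "y"), ("y", "z"), ("x", "z"), ("z", "w")])
print(is_p_cohesion(toy, {"x", "y", "z"}, 0.6))  # z keeps 2 of 3 neighbors: True
print(is_p_cohesion(toy, {"x", "y", "z"}, 0.7))  # 2/3 is below 0.7: False
```

In the toy graph, raising p from 0.6 to 0.7 forces node w into the subgraph, which illustrates how a larger p also limits the number of ties leaving the group.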
Below is an example graph. With p = 0.6, the grey label next to each node gives the smallest number of its neighbors that must lie inside a p-cohesion for the node to stay in it.
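These per-node requirements follow directly from the definition: a node with degree d needs at least ⌈p · d⌉ of its neighbors inside the subgraph. A short Python sketch of that arithmetic is shown here; the function name `min_neighbors_required` is hypothetical and not part of Ultipa's API.

```python
import math

def min_neighbors_required(degree, p):
    """Smallest number of neighbors a node of the given degree must have
    inside a p-cohesion, i.e. ceil(p * degree)."""
    # Subtracting a tiny tolerance avoids rounding up on float noise.
    return math.ceil(p * degree - 1e-9)

# Requirements for p = 0.6 and degrees 1 through 6.
for degree in range(1, 7):
    print(degree, min_neighbors_required(degree, 0.6))
```

For p = 0.6 this gives 1, 2, 2, 3, 3, 4 for degrees 1 to 6, so low-degree nodes may need all of their neighbors inside the subgraph while high-degree nodes can tolerate a few outside.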
Below are the minimal (in terms of the number of nodes) p-cohesion subgraphs including node a and node j respectively.
Ultipa's p-Cohesion algorithm finds the approximate minimal p-cohesion subgraph for each query node, and returns each subgraph in the form of its node set.
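To give a feel for what an approximate search might look like, here is a naive greedy expansion in Python. This is an assumption made for illustration only: the source does not describe how Ultipa's algorithm works internally, and the function names and the toy graph (a triangle x, y, z with a pendant node w attached to z) are hypothetical. The sketch starts from the query node and keeps adding an outside neighbor of some member that does not yet meet its requirement, until every member does.

```python
import math

def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def required(adj, u, p):
    """Minimum number of neighbors of u that must be inside the subgraph."""
    return math.ceil(p * len(adj.get(u, set())) - 1e-9)

def greedy_p_cohesion(adj, query, p):
    """Grow a p-cohesion around `query` (a heuristic, not guaranteed minimal)."""
    members = {query}
    while True:
        # Members that still have too few neighbors inside the subgraph.
        deficient = sorted(
            u for u in members
            if len(adj.get(u, set()) & members) < required(adj, u, p)
        )
        if not deficient:
            return members
        u = deficient[0]
        # A deficient node always has at least one neighbor outside.
        candidates = sorted(adj[u] - members)
        # Prefer the candidate with the most links into the current members.
        best = max(candidates, key=lambda v: len(adj[v] & members))
        members.add(best)

toy = build_adjacency([("x", "y"), ("y", "z"), ("x", "z"), ("z", "w")])
print(sorted(greedy_p_cohesion(toy, "x", 0.6)))  # ['x', 'y', 'z']
print(sorted(greedy_p_cohesion(toy, "x", 0.7)))  # ['w', 'x', 'y', 'z']
```

Because each added node is adjacent to an existing member, the result stays connected, and the loop ends at the latest when the whole connected component has been absorbed. The result is a valid p-cohesion but may be larger than the true minimum.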
Considerations
- The p-Cohesion algorithm ignores edge direction and treats all edges as undirected.
Example Graph
To create this graph:
// Run each row separately, in order, in an empty graphset
insert().into(@default).nodes([{_id:"A"}, {_id:"B"}, {_id:"C"}, {_id:"D"}, {_id:"E"}, {_id:"F"}, {_id:"G"}, {_id:"H"}, {_id:"I"}, {_id:"J"}, {_id:"K"}, {_id:"L"}])
insert().into(@default).edges([{_from:"K", _to:"J"}, {_from:"K", _to:"L"}, {_from:"J", _to:"L"}, {_from:"L", _to:"C"}, {_from:"C", _to:"A"}, {_from:"A", _to:"B"}, {_from:"C", _to:"B"}, {_from:"A", _to:"D"}, {_from:"B", _to:"G"}, {_from:"B", _to:"D"}, {_from:"D", _to:"C"}, {_from:"C", _to:"E"}, {_from:"C", _to:"F"}, {_from:"D", _to:"E"}, {_from:"E", _to:"F"}, {_from:"D", _to:"F"}, {_from:"D", _to:"H"}, {_from:"I", _to:"H"}, {_from:"F", _to:"I"}])
Creating HDC Graph
To load the entire graph to the HDC server hdc-server-1 as hdc_pcohesion:
CALL hdc.graph.create("hdc-server-1", "hdc_pcohesion", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
})
hdc.graph.create("hdc_pcohesion", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
}).to("hdc-server-1")
Parameters
Algorithm name: p_cohesion
Name | Type | Spec | Default | Optional | Description |
---|---|---|---|---|---|
ids | []_id | / | / | Yes | Specifies each node by its _id to find the approximate minimal p-cohesions that include it; specifies all nodes if it is unset. |
uuids | []_uuid | / | / | Yes | Specifies each node by its _uuid to find the approximate minimal p-cohesions that include it; specifies all nodes if it is unset. |
p | Float | (0,1) | / | No | For each node in a p-cohesion, at least a proportion p of its neighbors are within the p-cohesion, and no more than a proportion (1−p) are outside it. |
return_id_uuid | String | uuid, id, both | uuid | Yes | Includes _uuid, _id, or both to represent nodes in the results. |
File Writeback
CALL algo.p_cohesion.write("hdc_pcohesion", {
params: {
ids: ["A","I"],
p: 0.7,
return_id_uuid: "id"
},
return_params: {
file: {
filename: "cohesion"
}
}
})
algo(p_cohesion).params({
project: "hdc_pcohesion",
ids: ["A","I"],
p: 0.7,
return_id_uuid: "id"
}).write({
file: {
filename: "cohesion"
}
})
Result:
subgraph contains A: D,F,B,A,E,C,
subgraph contains I: I,D,F,H,B,A,E,C,
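If the written file needs further processing, each line can be split into the query node and its member list. The sketch below is in Python and assumes every line follows exactly the `subgraph contains <node>: <member>,<member>,` layout shown in the result above, including the trailing comma; the function name `parse_p_cohesion_line` is hypothetical.

```python
def parse_p_cohesion_line(line):
    """Split one result line into (query_node, [member, ...]).

    Assumes the layout 'subgraph contains <node>: <m1>,<m2>,...,'.
    """
    header, _, members = line.partition(":")
    query = header.replace("subgraph contains", "", 1).strip()
    # The trailing comma yields an empty last item, which is dropped here.
    nodes = [m.strip() for m in members.split(",") if m.strip()]
    return query, nodes

query, nodes = parse_p_cohesion_line("subgraph contains A: D,F,B,A,E,C,")
print(query, nodes)
```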
Stats Writeback
CALL algo.p_cohesion.write("hdc_pcohesion", {
params: {
ids: ["A","I"],
p: 0.7,
return_id_uuid: "id"
},
return_params: {
stats: {}
}
})
algo(p_cohesion).params({
project: "hdc_pcohesion",
ids: ["A","I"],
p: 0.7,
return_id_uuid: "id"
}).write({
stats: {}
})
Result:
max size of subgraphs |
---|
8 |
Stats Return
CALL algo.p_cohesion("hdc_pcohesion", {
params: {
ids: ["A","I"],
p: 0.7
},
return_params: {
stats: {}
}
}) YIELD s
RETURN s
exec{
algo(p_cohesion).params({
ids: ["A","I"],
p: 0.7
}).stats() as s
return s
} on hdc_pcohesion
Result:
max size of subgraphs |
---|
8 |