Overview
Conductance is a metric used to evaluate the quality of a community or cluster within a graph. Studies have shown that scoring functions that are based on conductance best capture the structure of ground-truth communities.
- J. Yang, J. Leskovec, Defining and Evaluating Network Communities based on Ground-truth (2012)
Concepts
Conductance
Intuitively, a good community should be tightly connected within itself but weakly connected to the rest of the graph.
For a community C
and the rest of the graph C'
, the conductance of C
is defined as the ratio of the cut size (the number of edges crossing between C
and C'
) to the minimum volume of C
and C'
(the sum of the degrees of the nodes within):
In the example below, the community C
is connected to the rest of the graph with three edges, i.e., cut(C, C') = 3
. The conductance of C
is then cond(C) = 3/min(19, 17) = 3/17 = 0.176471
.
If we adjust the cut to inlcude one more node in C
, the conductance becomes cond(C) = 3/min(21, 15) = 3/15 = 0.2
.
A small conductance value is desirable in community detection, where you want to identify dense communities with few edges to the outside. On the contrary, a large conductance value means that the community is loosely connected internally, with many edges reaching out to nodes outside. This suggests that the community is not tight knit.
Example Graph
To create this graph:
// Runs each row separately in order in an empty graphset
create().node_property(@default, "community_id", uint32)
insert().into(@default).nodes([{_id:"A", community_id: 1}, {_id:"B", community_id: 1}, {_id:"C", community_id: 1}, {_id:"D", community_id: 2}, {_id:"E", community_id: 2}, {_id:"F", community_id: 2}, {_id:"G", community_id: 1}, {_id:"H", community_id: 3}, {_id:"I", community_id: 3}, {_id:"J", community_id: 3}, {_id:"K", community_id: 3}])
insert().into(@default).edges([{_from:"A", _to:"B"}, {_from:"A", _to:"C"}, {_from:"A", _to:"D"}, {_from:"A", _to:"E"}, {_from:"A", _to:"G"}, {_from:"D", _to:"E"}, {_from:"D", _to:"F"}, {_from:"E", _to:"F"}, {_from:"G", _to:"D"}, {_from:"G", _to:"H"}, {_from:"J", _to:"D"}, {_from:"I", _to:"H"}, {_from:"I", _to:"J"}, {_from:"H", _to:"K"}, {_from:"J", _to:"K"}])
Creating HDC Graph
To load the entire graph to the HDC server hdc-server-1
as hdc_cond
:
CALL hdc.graph.create("hdc-server-1", "hdc_cond", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
})
hdc.graph.create("hdc_cond", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
}).to("hdc-server-1")
Parameters
Algorithm name: conductance
Name |
Type |
Spec |
Default |
Optional |
Description |
---|---|---|---|---|---|
community_property |
"<@schema.?>property " |
/ | / | No | The numeric node property holds the values representing community IDs. |
File Writeback
CALL algo.conductance.write("hdc_cond", {
params: {
community_property: "community_id"
},
return_params: {
file: {
filename: "conductance"
}
}
})
algo(conductance).params({
project: "hdc_cond",
community_property: "community_id"
}).write({
file: {
filename: "conductance"
}
})
Full Return
CALL algo.conductance("hdc_cond", {
params: {
community_property: "community_id"
},
return_params: {}
}) YIELD r
RETURN r
exec{
algo(conductance).params({
community_property: "community_id"
}) as r
return r
} on hdc_cond
Stream Return
CALL algo.conductance("hdc_cond", {
params: {
community_property: "community_id"
},
return_params: {
stream: {}
}
}) YIELD r
RETURN r
exec{
algo(conductance).params({
community_property: "community_id"
}).stream() as r
return r
} on hdc_cond