Overview
Harmonic Centrality is a variant of Closeness Centrality. The average shortest distance measurement proposed by harmonic centrality is compatible with infinite values which would occur in disconnected graph. Harmonic centrality was first proposed by M. Marchiori and V. Latora in 2000, and then by A. Dekker and Y. Rochat in 2005 and 2009:
- M. Marchiori, V. Latora, Harmony in the Small-World (2000)
- A. Dekker, Conceptual Distance in Social Network Analysis (2005)
- Y. Rochat, Closeness Centrality Extended to Unconnected Graphs: The Harmonic Centrality Index (2009)
Harmonic centrality takes on values between 0 to 1, nodes with higher scores have shorter distances to all other nodes.
Concepts
Shortest Distance
The shortest distance of two nodes is the number of edges contained in the shortest path between them. Please refer to Closeness Centrality for more details.
Harmonic Mean
Harmonic mean is the inverse of the arithmetic mean of the inverses of the variables. The formula for calculating the arithmetic mean A
and the harmonic mean H
is as follows:
A classic application of harmonic mean is to calculate the average speed when traveling back and forth at different speeds. Suppose there is a round trip, the forward and backward speeds are 30 km/h and 10 km/h respectively. What is the average speed for the entire trip?
The arithmetic mean A = (30+10)/2 = 20 km/h
does not seem reasonable in this case. Since the backward journey takes three times as long as the forward, during most time of the entire trip the speed stays at 10 km/h, so we expect the average speed to be closer to 10 km/h.
Assuming that one-way distance is 1, then the average speed that takes travel time into consideration is 2/(1/30+1/10) = 15 km/h
, and this is the harmonic mean, it is adjusted by the time spent during each journey.
Harmonic Centrality
Harmonic centrality score of a node defined by this algorithm is the inverse of the harmonic mean of the shortest distances from the node to all other nodes. The formula is:
where x
is the target node, y
is any node in the graph other than x
, k-1
is the number of y
, d(x,y)
is the shortest distance between x
and y
, d(x,y) = +∞
when x
and y
are not reachable to each other, in this case 1/d(x,y) = 0
.
The harmonic centrality of node a in the above graph is (1 + 1/2 + 1/+∞ + 1/+∞) / 4 = 0.375
, and the harmonic centrality of node d is (1/+∞ + 1/+∞ + 1/+∞ + 1) / 4 = 0.25
.
Harmonic Centrality algorithm consumes considerable computing resources. For a graph with V nodes, it is recommended to perform (uniform) sampling when V > 10,000, and the suggested number of samples is the base-10 logarithm of the number of nodes (
log(V)
).
For each execution of the algorithm, sampling is performed only once, centrality score of each node is computed based on the shortest distance between the node and all sample nodes.
Considerations
- The harmonic centrality score of isolated nodes is 0.
Example Graph
To create this graph:
// Runs each row separately in order in an empty graphset
create().node_schema("user").edge_schema("vote")
create().edge_property(@vote, "score", uint32)
insert().into(@user).nodes([{_id:"A"},{_id:"B"},{_id:"C"},{_id:"D"},{_id:"E"},{_id:"F"},{_id:"G"},{_id:"H"}])
insert().into(@vote).edges([{_from:"A", _to:"B", score:2}, {_from:"A", _to:"E", score:3}, {_from:"B", _to:"B", score:4}, {_from:"B", _to:"C", score:2}, {_from:"C", _to:"A", score:3}, {_from:"D", _to:"A", score:1}, {_from:"F", _to:"G", score:1}])
Creating HDC Graph
To load the entire graph to the HDC server hdc-server-1
as hdc_hc
:
CALL hdc.graph.create("hdc-server-1", "hdc_hc", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
})
hdc.graph.create("hdc_hc", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
}).to("hdc-server-1")
Parameters
Algorithm name: harmonic_centrality
Name |
Type |
Spec |
Default |
Optional |
Description |
---|---|---|---|---|---|
ids |
[]_id |
/ | / | Yes | Specifies nodes for computation by their _id ; computes for all nodes if it is unset. |
uuids |
[]_uuid |
/ | / | Yes | Specifies nodes for computation by their _uuid ; computes for all nodes if it is unset. |
direction |
String | in , out |
/ | Yes | Specifies that the shortest paths shoud only contain incoming edges (in ) or outgoing edges (out ). |
edge_schema_property |
[]"<@schema.?><property> " |
/ | / | Yes | Numeric edge properties used as weights, summing values across the specified properties; edges without this property are ignored. |
impl_type |
String | dijkstra , delta_stepping , spfa , beta |
beta |
Yes | Specifies the weighted shortest paths to be computed by the Dijkstra, Delta-Stepping, SPFA or the default (beta ) Ultipa shortest path algorithm. This is only valid when edge_schema_property is used. |
sample_size |
Integer | -1 , -2 , [1, |V|] |
-2 |
Yes | Specifies the sampling strategy for computation; Sets to -1 to sample log(|V|) nodes, or a number between [1, |V|] to sample a specific number of nodes (|V| is the total number of nodes in the graph). Sets to -2 to perform no sampling. This option is only valid when all nodes are involved in the computation. |
return_id_uuid |
String | uuid , id , both |
uuid |
Yes | Includes _uuid , _id , or both to represent nodes in the results. |
limit |
Integer | ≥-1 | -1 |
Yes | Limits the number of results returned; -1 includes all results. |
order |
String | asc , desc |
/ | Yes | Sorts the results by harmonic_centrality . |
File Writeback
CALL algo.harmonic_centrality.write("hdc_hc", {
params: {
return_id_uuid: "id",
order: "desc"
},
return_params: {
file: {
filename: "harmonic"
}
}
})
algo(harmonic_centrality).params({
project: "hdc_hc",
return_id_uuid: "id",
order: "desc"
}).write({
file: {
filename: "harmonic"
}
})
Result:
_id,harmonic_centrality
A,0.571429
B,0.428571
C,0.428571
D,0.357143
E,0.357143
F,0.142857
G,0.142857
H,0
DB Writeback
Writes the harmonic_centrality
values from the results to the specified node property. The property type is float
.
CALL algo.harmonic_centrality.write("hdc_hc", {
params: {},
return_params: {
db: {
property: 'hc'
}
}
})
algo(harmonic_centrality).params({
project: "hdc_hc"
}).write({
db:{
property: 'hc'
}
})
Full Return
CALL algo.harmonic_centrality("hdc_hc", {
params: {
return_id_uuid: "id",
ids: ["A", "B"],
edge_schema_property: "score"
},
return_params: {}
}) YIELD hc
RETURN hc
exec{
algo(harmonic_centrality).params({
return_id_uuid: "id",
ids: ["A", "B"],
edge_schema_property: "score"
}) as hc
return hc
} on hdc_hc
Result:
_id | harmonic_centrality |
---|---|
A | 0.309523 |
B | 0.219048 |
Stream Return
CALL algo.harmonic_centrality("hdc_hc", {
params: {
direction: "in",
return_id_uuid: "id"
},
return_params: {
stream: {}
}
}) YIELD hc
FILTER hc.harmonic_centrality = 0
RETURN hc
exec{
algo(harmonic_centrality).params({
direction: "in",
return_id_uuid: "id"
}).stream() as hc
where hc.harmonic_centrality == 0
return hc
} on hdc_hc
Result:
_id | harmonic_centrality |
---|---|
D | 0 |
F | 0 |
H | 0 |