Overview
Preferential attachment is a common phenomenon in complex networks: nodes with more connections are more likely to establish new connections. When two nodes both have a large number of connections, the probability that they connect to each other is significantly higher. This phenomenon was utilized by A. Barabási and R. Albert in the BA model they proposed for generating random scale-free networks in 2002:
- R. Albert, A. Barabási, Statistical mechanics of complex networks (2001)
The Preferential Attachment algorithm gauges the similarity between two nodes by calculating the product of the number of neighbors each node has. It is computed using the following formula:
PA(x,y) = |N(x)| * |N(y)|
where N(x) and N(y) are the sets of nodes adjacent to nodes x and y respectively.
Higher Preferential Attachment scores indicate greater similarity between nodes. A score of 0, which occurs when at least one of the two nodes has no neighbors, indicates no similarity between the two nodes.
In this example, PA(D,E) = |N(D)| * |N(E)| = |{B, C, E, F}| * |{B, D, F}| = 4 * 3 = 12.
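The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that is independent of Ultipa: the function name `preferential_attachment` and the dictionary layout are illustrative assumptions, and the neighbor sets are copied from the example above.

```python
def preferential_attachment(neighbors, x, y):
    """Return |N(x)| * |N(y)| for nodes x and y.

    `neighbors` maps each node to the set of its adjacent nodes.
    A node missing from the mapping is treated as having no neighbors.
    """
    return len(neighbors.get(x, set())) * len(neighbors.get(y, set()))


# Neighbor sets of D and E as given in the example above.
neighbors = {
    "D": {"B", "C", "E", "F"},
    "E": {"B", "D", "F"},
}

score = preferential_attachment(neighbors, "D", "E")
print(score)  # 4 * 3 = 12
```

A node with no recorded neighbors contributes a factor of 0, so any pair that includes it scores 0.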
Considerations
- The Preferential Attachment algorithm ignores the direction of edges and treats every edge as undirected.
Example Graph
To create this graph:
// Run each row separately, in order, in an empty graphset
insert().into(@default).nodes([{_id:"A"}, {_id:"B"}, {_id:"C"}, {_id:"D"}, {_id:"E"}, {_id:"F"}, {_id:"G"}])
insert().into(@default).edges([{_from:"A", _to:"B"}, {_from:"B", _to:"E"}, {_from:"C", _to:"B"}, {_from:"C", _to:"D"}, {_from:"C", _to:"F"}, {_from:"D", _to:"B"}, {_from:"D", _to:"E"}, {_from:"F", _to:"D"}, {_from:"F", _to:"G"}])
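To see what the algorithm works with, the sketch below rebuilds the same graph in plain Python and counts each node's neighbors with edge direction ignored. It is not Ultipa code; the helper name `build_undirected_neighbors` is an assumption made for illustration, and the edge list is copied from the insert statement above.

```python
from collections import defaultdict

# Edge list copied from the insert statement above, as (_from, _to) pairs.
EDGES = [
    ("A", "B"), ("B", "E"), ("C", "B"), ("C", "D"), ("C", "F"),
    ("D", "B"), ("D", "E"), ("F", "D"), ("F", "G"),
]


def build_undirected_neighbors(edges):
    """Map each node to the set of nodes it shares an edge with, ignoring direction."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    return dict(neighbors)


neighbors = build_undirected_neighbors(EDGES)
for node in sorted(neighbors):
    # Print the node, its neighbor count, and the neighbors themselves.
    print(node, len(neighbors[node]), sorted(neighbors[node]))
```

With these counts, |N(C)| = 3 and |N(E)| = 2, so PA(C,E) = 3 * 2 = 6 on this graph.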
Creating HDC Graph
To load the entire graph to the HDC server hdc-server-1 as hdc_tlp:
CALL hdc.graph.create("hdc-server-1", "hdc_tlp", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
})
hdc.graph.create("hdc_tlp", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
}).to("hdc-server-1")
Parameters
Algorithm name: topological_link_prediction
Name | Type | Spec | Default | Optional | Description |
---|---|---|---|---|---|
ids | []_id | / | / | No | Specifies the first group of nodes for computation by their _id; computes for all nodes if it is unset. |
uuids | []_uuid | / | / | No | Specifies the first group of nodes for computation by their _uuid; computes for all nodes if it is unset. |
ids2 | []_id | / | / | No | Specifies the second group of nodes for computation by their _id; computes for all nodes if it is unset. |
uuids2 | []_uuid | / | / | No | Specifies the second group of nodes for computation by their _uuid; computes for all nodes if it is unset. |
type | String | Preferential_Attachment | Adamic_Adar | No | Specifies the similarity type; for Preferential Attachment, keep it as Preferential_Attachment. |
return_id_uuid | String | uuid, id, both | uuid | Yes | Includes _uuid, _id, or both to represent nodes in the results. |
limit | Integer | ≥-1 | -1 | Yes | Limits the number of results returned; -1 includes all results. |
File Writeback
CALL algo.topological_link_prediction.write("hdc_tlp", {
params: {
ids: ["C"],
ids2: ["A","E","G"],
type: "Preferential_Attachment",
return_id_uuid: "id"
},
return_params: {
file: {
filename: "pa.txt"
}
}
})
algo(topological_link_prediction).params({
project: "hdc_tlp",
ids: ["C"],
ids2: ["A","E","G"],
type: "Preferential_Attachment",
return_id_uuid: "id"
}).write({
file: {
filename: "pa.txt"
}
})
Result:
_id1,_id2,result
C,A,3
C,E,6
C,G,3
Full Return
CALL algo.topological_link_prediction("hdc_tlp", {
params: {
ids: ["C"],
ids2: ["A","C","E","G"],
type: "Preferential_Attachment",
return_id_uuid: "id"
},
return_params: {}
}) YIELD pa
RETURN pa
exec{
algo(topological_link_prediction).params({
ids: ["C"],
ids2: ["A","C","E","G"],
type: "Preferential_Attachment",
return_id_uuid: "id"
}) as pa
return pa
} on hdc_tlp
Result:
_id1 | _id2 | result |
---|---|---|
C | A | 3 |
C | E | 6 |
C | G | 3 |
Stream Return
MATCH (n)
RETURN collect_list(n._id) AS IdList
NEXT
CALL algo.topological_link_prediction("hdc_tlp", {
params: {
ids: ["C"],
ids2: IdList,
type: "Preferential_Attachment",
return_id_uuid: "id"
},
return_params: {
stream: {}
}
}) YIELD pa
FILTER pa.result >= 6
RETURN pa
find().nodes() as n
with collect(n._id) as IdList
exec{
algo(topological_link_prediction).params({
ids: ["C"],
ids2: IdList,
type: "Preferential_Attachment",
return_id_uuid: "id"
}).stream() as pa
where pa.result >= 6
return pa
} on hdc_tlp
Result:
_id1 | _id2 | result |
---|---|---|
C | B | 12 |
C | D | 12 |
C | E | 6 |
C | F | 9 |
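The stream result can be cross-checked outside Ultipa. The following Python sketch recomputes the scores of C against every node of the example graph and keeps those of at least 6. The generator name `stream_scores` is hypothetical, and skipping the pair of a node with itself is an assumption inferred from the result table, which lists no row pairing C with C.

```python
from collections import defaultdict

# Edge list of the example graph, as (_from, _to) pairs.
EDGES = [
    ("A", "B"), ("B", "E"), ("C", "B"), ("C", "D"), ("C", "F"),
    ("D", "B"), ("D", "E"), ("F", "D"), ("F", "G"),
]


def stream_scores(edges, ids, ids2):
    """Yield (id1, id2, score) with score = |N(id1)| * |N(id2)|, one pair at a time."""
    neighbors = defaultdict(set)
    for src, dst in edges:
        # Direction is ignored: each edge makes both endpoints neighbors.
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    for x in ids:
        for y in ids2:
            if x == y:
                continue  # assumption: self-pairs are skipped, as in the result table
            yield x, y, len(neighbors[x]) * len(neighbors[y])


all_nodes = ["A", "B", "C", "D", "E", "F", "G"]
# Filter the stream as the queries above do: keep scores of at least 6.
rows = [row for row in stream_scores(EDGES, ["C"], all_nodes) if row[2] >= 6]
for row in rows:
    print(*row, sep=" | ")
```

Because the scores are produced lazily by a generator, the filter is applied pair by pair, which mirrors how a streamed result is consumed by the surrounding query.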