Overview
A random walk begins at a particular node in a graph and proceeds by repeatedly moving to a randomly chosen neighboring node, usually for a defined number of steps. The concept was introduced by the British mathematician and biostatistician Karl Pearson in 1905, and it has since become a cornerstone in the study of many systems, both within and beyond graph theory.
- K. Pearson, The Problem of the Random Walk (1905)
Concepts
Random Walk
A random walk is a mathematical model of a series of steps taken in a stochastic, unpredictable manner, like the erratic path of a drunken person.
The basic random walk takes place in a one-dimensional space: a walker starts at the origin of a number line and moves up or down by one unit at each step with equal likelihood. An example of a 10-step random walk is as follows:
Here is an example of performing this random walk multiple times, with each walk consisting of 100 steps:
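The one-dimensional walk described above can be sketched in a few lines of Python. This is only an illustration: the function name `random_walk_1d` and the fixed seed are assumptions made for the example and are not part of Ultipa.

```python
import random

def random_walk_1d(steps, rng):
    """Return every position visited by a 1D walk that starts at the origin."""
    position = 0
    path = [position]
    for _ in range(steps):
        # Move up or down by one unit with equal likelihood.
        position += rng.choice((-1, 1))
        path.append(position)
    return path

rng = random.Random(42)  # fixed seed, used here only for repeatability

# A single 10-step walk.
walk = random_walk_1d(10, rng)
print(walk)

# Several independent 100-step walks.
walks = [random_walk_1d(100, rng) for _ in range(5)]
print([w[-1] for w in walks])  # final position of each walk
```

Each path holds one more entry than the number of steps, because the starting position is included.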
Random Walk in Graph
In a graph, a random walk is a process where a path is formed by starting from a node and moving sequentially through neighboring nodes. This process is controlled by the walk depth, which determines the number of nodes to be visited.
Ultipa's Random Walk algorithm implements the classical form of random walk. By default, each edge is assigned the same weight (equal to 1), so all adjacent edges are equally likely to be traversed. When edge weights are specified, the probability of traversing an edge is proportional to its weight. Several variants of random walk also exist, such as Node2Vec Walk and Struc2Vec Walk.
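To make the weighting rule concrete, here is a minimal Python sketch of a classical weighted random walk. It is not Ultipa's implementation; the adjacency structure, the helper names `transition_probabilities` and `weighted_walk`, and the three-node graph are all assumptions chosen for illustration.

```python
import random

def transition_probabilities(adjacency, node):
    """Probability of stepping to each neighbor, proportional to edge weight."""
    total = sum(weight for _, weight in adjacency[node])
    probs = {}
    for neighbor, weight in adjacency[node]:
        # Accumulate so that parallel edges to the same neighbor add up.
        probs[neighbor] = probs.get(neighbor, 0.0) + weight / total
    return probs

def weighted_walk(adjacency, start, walk_length, rng):
    """Visit walk_length nodes (start included), choosing edges by weight."""
    walk = [start]
    current = start
    while len(walk) < walk_length:
        neighbors = adjacency.get(current, [])
        if not neighbors:
            break  # nowhere to go from this node
        nodes = [n for n, _ in neighbors]
        weights = [w for _, w in neighbors]
        current = rng.choices(nodes, weights=weights, k=1)[0]
        walk.append(current)
    return walk

# Hypothetical undirected graph: A-B with weight 1, A-C with weight 3.
adjacency = {
    "A": [("B", 1.0), ("C", 3.0)],
    "B": [("A", 1.0)],
    "C": [("A", 3.0)],
}

probs = transition_probabilities(adjacency, "A")
print(probs)  # {'B': 0.25, 'C': 0.75}

walk = weighted_walk(adjacency, "A", 6, random.Random(7))
print(walk)
```

With all weights set to 1, the same code reduces to the unweighted case in which every adjacent edge is equally likely.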
Considerations
- Self-loops can be traversed during a random walk.
- A random walk cannot start from an isolated node, as it has no adjacent edges to traverse.
- The Random Walk algorithm ignores edge direction and treats all edges as undirected.
Example Graph
To create this graph:
// Run each row separately, in order, in an empty graphset
create().edge_property(@default, "score", float)
insert().into(@default).nodes([{_id:"A"},{_id:"B"},{_id:"C"},{_id:"D"},{_id:"E"},{_id:"F"},{_id:"G"},{_id:"H"},{_id:"I"},{_id:"J"},{_id:"K"}])
insert().into(@default).edges([{_from:"A", _to:"B", score:1}, {_from:"A", _to:"C", score:3}, {_from:"C", _to:"D", score:1.5}, {_from:"D", _to:"C", score:2.4}, {_from:"D", _to:"F", score:5}, {_from:"E", _to:"C", score:2.2}, {_from:"E", _to:"F", score:0.6}, {_from:"F", _to:"G", score:1.5}, {_from:"G", _to:"J", score:2}, {_from:"H", _to:"G", score:2.5}, {_from:"H", _to:"I", score:1}, {_from:"I", _to:"I", score:3.1}, {_from:"J", _to:"G", score:2.6}])
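As a quick way to see how the considerations above apply to this graph, the following Python sketch rebuilds the same edge list as an undirected adjacency map. The representation is an assumption made for illustration (in particular, a self-loop is listed once), not a description of how Ultipa stores the graph.

```python
from collections import defaultdict

nodes = list("ABCDEFGHIJK")
edges = [
    ("A", "B", 1), ("A", "C", 3), ("C", "D", 1.5), ("D", "C", 2.4),
    ("D", "F", 5), ("E", "C", 2.2), ("E", "F", 0.6), ("F", "G", 1.5),
    ("G", "J", 2), ("H", "G", 2.5), ("H", "I", 1), ("I", "I", 3.1),
    ("J", "G", 2.6),
]

# Ignore direction: register every edge on both of its endpoints.
adjacency = defaultdict(list)
for src, dst, score in edges:
    adjacency[src].append((dst, score))
    if src != dst:  # assumption: a self-loop appears once in the list
        adjacency[dst].append((src, score))

isolated = [n for n in nodes if n not in adjacency]
self_loops = sorted({src for src, dst, _ in edges if src == dst})
neighbors_of_c = sorted(n for n, _ in adjacency["C"])

print(isolated)        # ['K']
print(self_loops)      # ['I']
print(neighbors_of_c)  # ['A', 'D', 'D', 'E']
```

Node K has no edges, so no walk can start from it; node I has a self-loop, so a walk may stay on I for several steps; and C is connected to D by two edges, one in each original direction.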
Creating HDC Graph
To load the entire graph to the HDC server hdc-server-1 as hdc_randomWalk:
CALL hdc.graph.create("hdc-server-1", "hdc_randomWalk", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
})
hdc.graph.create("hdc_randomWalk", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
}).to("hdc-server-1")
Parameters
Algorithm name: random_walk
Name | Type | Spec | Default | Optional | Description |
---|---|---|---|---|---|
ids | []_id | / | / | Yes | Specifies nodes to start random walk by their _id; computes for all nodes if it is unset. |
uuids | []_uuid | / | / | Yes | Specifies nodes to start random walk by their _uuid; computes for all nodes if it is unset. |
walk_length | Integer | ≥1 | 1 | Yes | Depth of each walk, i.e., the number of nodes to visit. |
walk_num | Integer | ≥1 | 1 | Yes | Number of walks to perform for each specified node. |
edge_schema_property | []"<@schema.?><property>" | / | / | Yes | Numeric edge properties used as edge weights, summing values across the specified properties; edges without the specified properties are ignored. |
return_id_uuid | String | uuid, id, both | uuid | Yes | Includes _uuid, _id, or both values to represent nodes in the results. |
limit | Integer | ≥-1 | -1 | Yes | Limits the number of results returned; -1 includes all results. |
File Writeback
CALL algo.random_walk.write("hdc_randomWalk", {
params: {
return_id_uuid: "id",
walk_length: 6,
walk_num: 2
},
return_params: {
file: {
filename: "walks"
}
}
})
algo(random_walk).params({
project: "hdc_randomWalk",
return_id_uuid: "id",
walk_length: 6,
walk_num: 2
}).write({
file:{
filename: 'walks'
}})
Result:
_ids
J,G,H,G,F,D,
D,C,D,C,A,C,
F,G,H,I,I,I,
H,G,H,I,H,G,
B,A,C,E,C,D,
A,C,D,C,D,C,
E,C,E,F,E,C,
C,D,C,E,F,D,
I,I,I,H,G,J,
G,J,G,J,G,H,
J,G,J,G,F,E,
D,C,E,C,D,F,
F,D,C,A,B,A,
H,I,I,I,H,I,
B,A,B,A,C,E,
A,C,D,C,A,B,
E,F,G,F,D,F,
C,E,F,E,F,D,
I,I,H,I,I,I,
G,H,I,I,H,I,
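If the written file follows the layout shown above (a `_ids` header line, then one comma-separated walk per line with a trailing comma), it could be read back into Python lists roughly as follows. The layout is inferred only from this result listing, and `parse_walks` is a hypothetical helper, so treat this as a sketch to adapt to the actual file.

```python
def parse_walks(text):
    """Turn lines such as 'J,G,H,G,F,D,' into lists of node IDs."""
    walks = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "_ids":
            continue  # skip blank lines and the header
        # Drop the empty token produced by the trailing comma.
        walks.append([token for token in line.split(",") if token])
    return walks

# First three rows of the result above, used as sample input.
sample = """_ids
J,G,H,G,F,D,
D,C,D,C,A,C,
F,G,H,I,I,I,
"""

walks = parse_walks(sample)
print(walks[0])  # ['J', 'G', 'H', 'G', 'F', 'D']
```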
Full Return
CALL algo.random_walk("hdc_randomWalk", {
params: {
return_id_uuid: "id",
walk_length: 6,
walk_num: 2,
edge_schema_property: 'score'
},
return_params: {}
}) YIELD walks
RETURN walks
exec{
algo(random_walk).params({
return_id_uuid: "id",
walk_length: 6,
walk_num: 2,
edge_schema_property: 'score'
}) as walks
return walks
} on hdc_randomWalk
Result:
_ids |
---|
["J","G","J","G","J","G"] |
["D","F","E","C","E","C"] |
["F","D","F","D","F","G"] |
["H","I","I","I","I","H"] |
["B","A","C","A","C","D"] |
["A","C","A","B","A","B"] |
["E","C","E","F","D","C"] |
["C","A","C","D","F","D"] |
["I","H","I","I","I","I"] |
["G","H","G","J","G","J"] |
["J","G","J","G","J","G"] |
["D","F","D","C","E","C"] |
["F","D","C","D","C","E"] |
["H","I","H","G","J","G"] |
["B","A","C","D","F","G"] |
["A","C","D","C","A","C"] |
["G","J","G","F","D","F"] |
["H","I","I","I","I","H"] |
["F","D","F","D","F","G"] |
["D","F","E","C","E","C"] |
["J","G","J","G","J","G"] |
Stream Return
CALL algo.random_walk("hdc_randomWalk", {
params: {
return_id_uuid: "id",
walk_length: 5,
walk_num: 1,
edge_schema_property: '@default.score'
},
return_params: {
stream: {}
}
}) YIELD walks
RETURN walks
exec{
algo(random_walk).params({
return_id_uuid: "id",
walk_length: 5,
walk_num: 1,
edge_schema_property: '@default.score'
}).stream() as walks
return walks
} on hdc_randomWalk
Result:
_ids |
---|
["J","G","J","G","J"] |
["D","F","G","J","G"] |
["F","G","F","D","C"] |
["H","G","H","G","J"] |
["B","A","C","D","F"] |
["A","C","A","C","A"] |
["E","F","D","F","D"] |
["C","D","F","D","F"] |
["I","I","I","I","I"] |
["G","H","G","J","G"] |
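One simple sanity check on returned walks is to confirm that every consecutive pair of nodes is joined by an edge of the example graph, with direction ignored. The Python sketch below does this for the ten walks listed above; the helper `is_valid_walk` is a hypothetical name used only for illustration.

```python
# Edges of the example graph, with direction and duplicates ignored.
edges = [
    ("A", "B"), ("A", "C"), ("C", "D"), ("D", "F"), ("E", "C"), ("E", "F"),
    ("F", "G"), ("G", "J"), ("H", "G"), ("H", "I"), ("I", "I"),
]
allowed = {frozenset(edge) for edge in edges}

def is_valid_walk(walk, allowed):
    """True if each consecutive pair of nodes is connected by an edge."""
    return all(frozenset(pair) in allowed for pair in zip(walk, walk[1:]))

# The walks from the stream result above.
walks = [
    ["J", "G", "J", "G", "J"], ["D", "F", "G", "J", "G"],
    ["F", "G", "F", "D", "C"], ["H", "G", "H", "G", "J"],
    ["B", "A", "C", "D", "F"], ["A", "C", "A", "C", "A"],
    ["E", "F", "D", "F", "D"], ["C", "D", "F", "D", "F"],
    ["I", "I", "I", "I", "I"], ["G", "H", "G", "J", "G"],
]

invalid = [walk for walk in walks if not is_valid_walk(walk, allowed)]
print(invalid)  # []
```

The walk that stays on I is valid only because of the self-loop on I, which matches the consideration that self-loops can be traversed.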