Change Password

Please enter the password.
Please enter the password. Between 8-64 characters. Not identical to your email address. Contain at least 3 of: uppercase, lowercase, numbers, and special characters.
Please enter the password.
Submit

Change Nickname

Current Nickname:
Submit

Apply New License

License Detail

Please complete this required field.

  • Ultipa Graph V4

Standalone

Please complete this required field.

Please complete this required field.

The MAC address of the server you want to deploy.

Please complete this required field.

Please complete this required field.

Cancel
Apply
ID
Product
Status
Cores
Applied Validity Period(days)
Effective Date
Excpired Date
Mac Address
Apply Comment
Review Comment
Close
Profile
  • Full Name:
  • Phone:
  • Company:
  • Company Email:
  • Country:
  • Language:
Change Password
Apply

You have no license application record.

Apply
Certificate Issued at Valid until Serial No. File
Serial No. Valid until File

Not having one? Apply now! >>>

Product Created On ID Amount (USD) Invoice
Product Created On ID Amount (USD) Invoice

No Invoice

v5.0
Search
    English
    v5.0

      Fast Random Projection

      HDC

      Overview

      Fast Random Projection (or FastRP) is a scalable and performant algorithm for learning node representations in a graph. It achieves an iterative computation of node embeddings with two components: first, node similarity matrix construction; second, dimension reduction by random projection.

      FastRP was proposed by H. Chen et al. in 2019:

      FastRP Framework

      The authors of FastRP propose that most network embedding methods essentially consist of two components: (1) construct a node similarity matrix or implicitly sample node pairs from it, and then (2) apply dimension reduction techniques to this matrix.

      Regarding the dimension reduction, many popular algorithms, like Node2Vec, rely on time-consuming methods such as Skip-gram. The authors argue that the success of these methods should be attributed to the proper construction of this similarity matrix, rather than the dimension reduction method employed.

      Therefore, FastRP utilizes very sparse random projection, which is a scalable optimization-free method for dimension reduction.

      Very Sparse Random Projection

      Random projection is a dimension reduction method that roughly preserves the relative pairwise distances between data points when embedding them from the high-dimensional space into a lower-dimensional space. The theoretical guarantees of random projection are mainly derived from the Johnson-Lindenstrauss Lemma.

      The idea behind random projection is very simple: to reduce a matrix Mn×m (for graph data, we have n = m) to a matrix Nn×d where d ≪ n, we can simply multiply M with an random projection matrix Rm×d: N = M ⋅ R**.

      The matrix R is generated by sampling its entries independently from a random distribution. Specifically, FastRP considers the very sparse random projection method for dimension reduction, where entries of R are sampled from the following distribution:

      and s=sqrt(m).

      Very sparse random projection has superior computation efficiency as it only requires matrix multiplication, and (1-1/s) of the entries of R are zero.

      Additionally, Ultipa supports to initialize the matrix R with a combination with some selected numeric node properties (features). In this case, each row in matrix R is formed by concatenation of two parts:

      • Elements d1 ~ dx are generated by the Very Sparse Random Projection algorithm.
      • Elements p1 ~ py is a linear combination of the selected node property values.
      • x + y equals to the dimension of the final embeddings, where y is referred to as the property dimension.
      • The number of selected node features doesn't have to be the same as the property dimension.

      Node Similarity Matrix Construction

      FastRP keeps two key features when constructing node similarity matrix:

      1. Preserve high-order proximity in the graph
      2. Perform element-wise normalization on the matrix

      Let S be the adjacency matrix of graph G =(V,E), D be the degree matrix, and A be the transition matrix and A = D-1S. See the example below, the number on each edge represents the edge weight:

      The fact is that in matrix A, entry Aij stands for the probability of node i reaching to node j with 1-step random walk. Furthermore, if raises A to k-th power to get matrix Ak, then Akij denotes the probability of reaching node j from node i in k steps of random walk. Hence, matrix Ak preserves high-order proximity of the graph.

      However, as k → ∞, it can be proved that Akij → dj/2m where m = |E|, and dj is the degree of the j-th node. Since many real-world graphs are scale-free, the entries in Ak follow a heavy-tailed distribution. This implies that the pairwise distances between data points in Ak are predominantly influenced by columns with exceptionally large values, making them less meaningful and posing challenges for the next dimension reduction step.

      To address this, FastRP further performs normalization on matrix Ak by multiplying it with a diagonal normalization matrix L, where Lij = (dj/2m)β, and β is the normalization strength. The normalization strength controls the impact of node degree on the embeddings: when β is negative, the importance of high-degree neighbors is weakened, and the opposite when β is positive.

      Algorithm Process

      The FastRP algorithm follows these steps:

      1. Generate the random projection matrix R.
      2. Reduce the dimension of the normalized 1-step transition matrix: N1 = A ⋅ L ⋅ R
      3. Reduce the dimension of the normalized 2-step transition matrix: N2 = A ⋅ (A ⋅ L ⋅ R) = A ⋅ N1
      4. Repeat step 3 until the normalized k-step transition matrix: Nk = A ⋅ Nk-1
      5. Generate the weighted combination of different powers of A: N = α1N1 + ... + αkNk, where α1,...,αk are the iteration weights.

      The efficiency of FastRP also benefits from the associative property of matrix multiplication. It doesn't calculate Ak from scratch as the computation can be done iteratively.

      Considerations

      • The FastRP algorithm ignores the direction of edges but calculates them as undirected edges.

      Example Graph

      To create this graph:

      // Runs each row separately in order in an empty graphset
      create().node_property(@default, "level", uint32).node_property(@default, "maturity", float).node_property(@default, "degree", uint32).edge_property(@default, "score", float)
      insert().into(@default).nodes([{_id:"A", level:4, maturity:0.8, degree:12},{_id:"B", level:3, maturity:0.6, degree:20},{_id:"C", level:1, maturity:0.2, degree:5},{_id:"D", level:4, maturity:0.5, degree:18},{_id:"E", level:1, maturity:0.2, degree:25},{_id:"F", level:3, maturity:0.9, degree:3},{_id:"G", level:2, maturity:0.8, degree:6},{_id:"H", level:2, maturity:0.9, degree:33},{_id:"I", level:2, maturity:0.4, degree:2},{_id:"J", level:1, maturity:0.6, degree:10}])
      insert().into(@default).edges([{_from:"A", _to:"B", score:1}, {_from:"A", _to:"C", score:3}, {_from:"C", _to:"D", score:1.5}, {_from:"D", _to:"F", score:5}, {_from:"E", _to:"C", score:2.2}, {_from:"E", _to:"F", score:0.6}, {_from:"F", _to:"G", score:1.5}, {_from:"G", _to:"J", score:2}, {_from:"H", _to:"G", score:2.5}, {_from:"H", _to:"I", score:1}])
      

      Creating HDC Graph

      To load the entire graph to the HDC server hdc-server-1 as hdc_fastRP:

      CALL hdc.graph.create("hdc-server-1", "hdc_fastRP", {
        nodes: {"*": ["*"]},
        edges: {"*": ["*"]},
        direction: "undirected",
        load_id: true,
        update: "static",
        query: "query",
        default: false
      })
      

      hdc.graph.create("hdc_fastRP", {
        nodes: {"*": ["*"]},
        edges: {"*": ["*"]},
        direction: "undirected",
        load_id: true,
        update: "static",
        query: "query",
        default: false
      }).to("hdc-server-1")
      

      Parameters

      Algorithm name: fastRP

      Name
      Type
      Spec
      Default
      Optional
      Description
      dimension Integer ≥2 / No Dimensionality of the embeddings.
      normalizationStrength Float / / Yes The normalization strength β .
      iterationWeights []Float >0 [0.0,1.0,1.0] Yes Weight for each iteration, the size of the array is the number of iterations.
      edge_schema_property []"@<schema>?.<property>" / / Yes Numeric edge properties used as edge weights, summing values across the specified properties; edges without the specified properties are ignored.
      node_schema_property []"@<schema>?.<property>" / / Yes Numeric node properties used as features; propertyDimension and node_schema_property should be configured together or both unset.
      propertyDimension Integer (0,dimension] / Yes Length of the property dimension; propertyDimension and node_schema_property should be configured together or both unset.
      return_id_uuid String uuid, id, both uuid Yes Includes _uuid, _id, or both values to represent nodes in the results.
      limit Integer ≥-1 -1 Yes Limits the number of results returned; -1 includes all results.

      File Writeback

      CALL algo.fastRP.write("hdc_fastRP", {
        params: {
          return_id_uuid: "id",
          dimension: 5,
          normalizationStrength: 1.5,
          iterationWeights: [0, 0.8, 0.2, 1],
          edge_schema_property: 'score',
          node_schema_property: ['level', 'maturity', 'degree'],
          propertyDimension: 2
        },
        return_params: {
          file: {
            filename: "embeddings"
          }
        }
      })
      

      algo(fastRP).params({
        project: "hdc_fastRP",
        return_id_uuid: "id",
        dimension: 5,
        normalizationStrength: 1.5,
        iterationWeights: [0, 0.8, 0.2, 1],
        edge_schema_property: 'score',
        node_schema_property: ['level', 'maturity', 'degree'],
        propertyDimension: 2
      }).write({
        file:{
          filename: 'embeddings'
        }
      })
      

      _id,embedding_result
      J,-1.12587,-1.20511,-1.12587,0.079247,0.079247,
      D,-1.11528,-1.22975,-1.11528,0,0,
      F,-1.1547,-1.1547,-1.1547,0,0,
      H,-1.07893,-1.15311,-1.22729,0,0,
      B,-1.1547,-1.1547,-1.1547,0,0,
      A,-1.1547,-1.1547,-1.1547,0,0,
      E,-1.12083,-1.12083,-1.21962,0,0,
      C,-1.24807,-1.24807,-0.394019,-0.854049,0,
      I,-1.1547,-1.1547,-1.1547,0,0,
      G,-0.538663,-1.54159,-1.04013,-0.501465,0,
      

      DB Writeback

      Writes the embedding_result values from the results to the specified node property. The property type is float[].

      CALL algo.fastRP.write("hdc_fastRP", {
        params: {
          return_id_uuid: "id",
          dimension: 10,
          normalizationStrength: 1.5,
          iterationWeights: [0, 0.8, 0.2, 1],
          edge_schema_property: 'score',
          node_schema_property: ['level', 'maturity', 'degree'],
          propertyDimension: 2
        },
        return_params: {
          db: {
            property: "vector"
          }
        }
      })
      

      algo(fastRP).params({
        project: "hdc_fastRP",
        return_id_uuid: "id",
        dimension: 10,
        normalizationStrength: 1.5,
        iterationWeights: [0, 0.8, 0.2, 1],
        edge_schema_property: 'score',
        node_schema_property: ['level', 'maturity', 'degree'],
        propertyDimension: 2
      }).write({
        db: {
          property: "vector"
        }
      })
      

      Full Return

      CALL algo.fastRP("hdc_fastRP", {
        params: {
          return_id_uuid: "id",
          dimension: 3,
          normalizationStrength: 1.5,
          iterationWeights: [0, 0.8, 0.2, 0.3, 1]
        },
        return_params: {}
      }) YIELD embeddings
      RETURN embeddings
      

      exec{
        algo(fastRP).params({
          return_id_uuid: "id",
          dimension: 3,
          normalizationStrength: 1.5,
          iterationWeights: [0, 0.8, 0.2, 0.3, 1]
        }) as embeddings
        return embeddings
      } on hdc_fastRP
      

      Result:

      _id
      embedding_result
      J [0.0,2.299999952316284,0.0]
      D [0.0,-2.299999952316284,0.0]
      F [0.0,0.0,2.299999952316284]
      H [-2.299999952316284,0.0,0.0]
      B [-2.299999952316284,0.0,0.0]
      A [-1.6263456344604492,0.0,-1.6263456344604492]
      E [0.0,2.299999952316284,0.0]
      C [0.0,0.0,0.0]
      I [1.6263456344604492,1.6263456344604492,0.0]
      G [0.0,0.0,0.0]

      Stream Return

      CALL algo.fastRP("hdc_fastRP", {
        params: {
          return_id_uuid: "id",
          dimension: 4,
          normalizationStrength: 1.5,
          iterationWeights: [0, 0.8, 0.2, 0.3, 1]
        },
        return_params: {
         stream: {}
        }
      }) YIELD embeddings
      RETURN embeddings
      

      exec{
        algo(fastRP).params({
          return_id_uuid: "id",
          dimension: 4,
          normalizationStrength: 1.5,
          iterationWeights: [0, 0.8, 0.2, 0.3, 1]
        }).stream() as embeddings
        return embeddings
      } on hdc_fastRP
      

      Result:

      _id
      embedding_result
      J [0.0,-1.6263456344604492,0.0,-1.6263456344604492]
      D [1.6263456344604492,0.0,0.0,1.6263456344604492]
      F [0.0,0.0,-2.299999952316284,0.0]
      H [1.6263456344604492,0.0,0.0,-1.6263456344604492]
      B [0.0,-2.299999952316284,0.0,0.0]
      A [-1.6263456344604492,-1.6263456344604492,0.0,0.0]
      E [0.0,0.0,0.0,-2.299999952316284]
      C [0.0,1.6263456344604492,0.0,1.6263456344604492]
      I [0.0,0.0,0.0,0.0]
      G [-1.3279056549072266,0.0,-1.3279056549072266,1.3279056549072266]
      Please complete the following information to download this book
      *
      公司名称不能为空
      *
      公司邮箱必须填写
      *
      你的名字必须填写
      *
      你的电话必须填写