v5.0
      Cosine Similarity

      HDC

      Overview

      Cosine similarity treats the data objects in a dataset as vectors and uses the cosine of the angle between two vectors to indicate how similar they are. In a graph, N numeric properties (features) of nodes are specified to form N-dimensional vectors; two nodes are considered similar if their vectors are similar.

      Cosine similarity ranges from -1 to 1: 1 means that the two vectors point in the same direction, 0 means that they are orthogonal, and -1 means that they point in opposite directions.

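      In 2-dimensional space, the cosine similarity between vectors A(a1, a2) and B(b1, b2) is computed as:

      $$\cos\theta = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{a_1 b_1 + a_2 b_2}{\sqrt{a_1^2 + a_2^2}\,\sqrt{b_1^2 + b_2^2}}$$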

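      In 3-dimensional space, the cosine similarity between vectors A(a1, a2, a3) and B(b1, b2, b3) is computed as:

      $$\cos\theta = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{a_1 b_1 + a_2 b_2 + a_3 b_3}{\sqrt{a_1^2 + a_2^2 + a_3^2}\,\sqrt{b_1^2 + b_2^2 + b_3^2}}$$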

      The following diagram shows the relationship between vectors A and B in 2D and 3D spaces, as well as the angle θ between them:

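      Generalizing to N-dimensional space, the cosine similarity between vectors A(a1, ..., aN) and B(b1, ..., bN) is computed as:

      $$\cos\theta = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{N} a_i b_i}{\sqrt{\sum_{i=1}^{N} a_i^2}\,\sqrt{\sum_{i=1}^{N} b_i^2}}$$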
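The N-dimensional definition can be sketched in a few lines of Python. This is only an illustration of the formula, not the HDC implementation; the function name `cosine_similarity` and the sample vectors are assumptions made for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical sample vectors covering the three reference cases.
same = cosine_similarity([1, 2], [2, 4])              # same direction
opposite = cosine_similarity([1, 2], [-1, -2])        # opposite direction
orthogonal = cosine_similarity([1, 0, 0], [0, 1, 0])  # perpendicular
print(same, opposite, orthogonal)
```

Up to floating-point rounding, the three calls give values at 1, -1, and 0 respectively. How a production implementation handles zero vectors is not stated here; raising an error is simply the choice made in this sketch.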

      Considerations

      • Theoretically, the calculation of cosine similarity between two nodes does not depend on their connectivity.
      • The value of cosine similarity depends only on the direction of the vectors, not on their length.
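The independence from vector length can be checked with a small Python sketch. The vectors below are hypothetical and chosen so the arithmetic is easy to follow; the helper `cosine_similarity` is an illustrative function, not part of any Ultipa API.

```python
import math

def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

a = [3, 4]
b = [4, 3]
a_scaled = [10 * x for x in a]  # same direction as a, ten times longer

base = cosine_similarity(a, b)
stretched = cosine_similarity(a_scaled, b)

# Scaling a vector by a positive factor leaves the angle unchanged.
print(math.isclose(base, stretched))
```

Scaling by a negative factor would flip the direction and therefore the sign of the result, so the invariance holds for positive scale factors only.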

      Example Graph

      To create this graph:

      // Run each row separately, in order, in an empty graphset
      create().node_schema("product")
      create().node_property(@product, "price", int32).node_property(@product, "weight", int32).node_property(@product, "width", int32).node_property(@product, "height", int32)
      insert().into(@product).nodes([{_id:"product1", price:50, weight:160, width:20, height:152}, {_id:"product2", price:42, weight:90, width:30, height:90}, {_id:"product3", price:24, weight:50, width:55, height:70}, {_id:"product4", price:38, weight:20, width:32, height:66}])
      

      Creating HDC Graph

      To load the entire graph to the HDC server hdc-server-1 as hdc_sim_prop:

      CALL hdc.graph.create("hdc-server-1", "hdc_sim_prop", {
        nodes: {"*": ["*"]},
        edges: {"*": ["*"]},
        direction: "undirected",
        load_id: true,
        update: "static",
        query: "query",
        default: false
      })
      

      hdc.graph.create("hdc_sim_prop", {
        nodes: {"*": ["*"]},
        edges: {"*": ["*"]},
        direction: "undirected",
        load_id: true,
        update: "static",
        query: "query",
        default: false
      }).to("hdc-server-1")
      

      Parameters

      Algorithm name: similarity

      Name | Type | Spec | Default | Optional | Description
      ids | []_id | / | / | No | Specifies the first group of nodes for computation by their _id; computes for all nodes if it is unset.
      uuids | []_uuid | / | / | No | Specifies the first group of nodes for computation by their _uuid; computes for all nodes if it is unset.
      ids2 | []_id | / | / | Yes | Specifies the second group of nodes for computation by their _id; computes for all nodes if it is unset.
      uuids2 | []_uuid | / | / | Yes | Specifies the second group of nodes for computation by their _uuid; computes for all nodes if it is unset.
      type | String | cosine | cosine | No | Specifies the type of similarity to compute; for Cosine Similarity, keep it as cosine.
      node_schema_property | []"<@schema.?><property>" | / | / | No | Numeric node properties to form a vector for each node; all specified properties must belong to the same label (schema).
      return_id_uuid | String | uuid, id, both | uuid | Yes | Includes _uuid, _id, or both to represent nodes in the results.
      order | String | asc, desc | / | Yes | Sorts the results by similarity.
      limit | Integer | ≥-1 | -1 | Yes | Limits the number of results returned; -1 includes all results.
      top_limit | Integer | ≥-1 | -1 | Yes | Limits the number of results returned for each node specified with ids/uuids in selection mode; -1 includes all results with a similarity greater than 0. This parameter is invalid in pairing mode.

      The algorithm has two calculation modes:

      1. Pairing: When both ids/uuids and ids2/uuids2 are configured, each node in ids/uuids is paired with each node in ids2/uuids2 (excluding self-pairing), and pairwise similarities are computed.
      2. Selection: When only ids/uuids is configured, pairwise similarities are computed between each target node and all other nodes in the graph. The results include all or a limited number of nodes with a similarity > 0 to the target node, ordered in descending similarity.
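The two modes can be sketched in plain Python using the property values of the example graph. This is a rough model of the behavior described above, not the algorithm's actual implementation; the names `PRODUCTS`, `pairing`, and `selection` are hypothetical, and details such as tie-breaking or duplicate handling are assumptions.

```python
import math

# (price, weight, width, height) vectors taken from the example graph.
PRODUCTS = {
    "product1": (50, 160, 20, 152),
    "product2": (42, 90, 30, 90),
    "product3": (24, 50, 55, 70),
    "product4": (38, 20, 32, 66),
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def pairing(ids, ids2):
    """Pair each node in ids with each node in ids2, skipping self-pairs."""
    return [
        (i, j, cosine_similarity(PRODUCTS[i], PRODUCTS[j]))
        for i in ids
        for j in ids2
        if i != j
    ]

def selection(ids, top_limit=-1):
    """For each target, rank all other nodes with similarity > 0, descending."""
    results = []
    for i in ids:
        ranked = sorted(
            (
                (i, j, cosine_similarity(PRODUCTS[i], vec))
                for j, vec in PRODUCTS.items()
                if j != i
            ),
            key=lambda row: row[2],
            reverse=True,
        )
        ranked = [row for row in ranked if row[2] > 0]
        # top_limit = -1 keeps every positive-similarity result.
        results.extend(ranked if top_limit == -1 else ranked[:top_limit])
    return results

pairs = pairing(["product1", "product2"], ["product2", "product3", "product4"])
top = selection(["product1", "product3"], top_limit=1)
for id1, id2, sim in pairs + top:
    print(id1, id2, round(sim, 6))
```

Pairing two nodes against three yields five rows rather than six because the product2 self-pair is skipped. In selection mode with `top_limit=1`, product2 comes out as the closest match for both product1 and product3.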

      File Writeback

      CALL algo.similarity.write("hdc_sim_prop", {
        params: {
          return_id_uuid: "id",
          ids: "product1",
          ids2: ["product2", "product3", "product4"],
          node_schema_property: ["price", "weight", "width", "height"],
          type: "cosine"
        },
        return_params: {
          file: {
            filename: "cosine"
          }
        }
      })
      

      algo(similarity).params({
        project: "hdc_sim_prop",
        return_id_uuid: "id",
        ids: "product1",
        ids2: ["product2", "product3", "product4"],
        node_schema_property: ["price", "weight", "width", "height"],
        type: "cosine"
      }).write({
        file: {
          filename: "cosine"
        }
      })
      

      Result:

      _id1,_id2,similarity
      product1,product2,0.986529
      product1,product3,0.878858
      product1,product4,0.816876
      

      Full Return

      CALL algo.similarity("hdc_sim_prop", {
        params: {
          return_id_uuid: "id",
          ids: ["product1","product2"], 
          ids2: ["product2","product3","product4"],
          node_schema_property: ["price", "weight", "width", "height"],
          type: "cosine"
        },
        return_params: {}
      }) YIELD cs
      RETURN cs
      

      exec{
        algo(similarity).params({
          return_id_uuid: "id",
          ids: ["product1","product2"], 
          ids2: ["product2","product3","product4"],
          node_schema_property: ["price", "weight", "width", "height"],
          type: "cosine"
        }) as cs
        return cs
      } on hdc_sim_prop
      

      Result:

      _id1 _id2 similarity
      product1 product2 0.986529
      product1 product3 0.878858
      product1 product4 0.816876
      product2 product3 0.934217
      product2 product4 0.881988

      Stream Return

      CALL algo.similarity("hdc_sim_prop", {
        params: {
          return_id_uuid: "id",
          ids: ["product1", "product3"], 
          node_schema_property: ["price", "weight", "width", "height"],
          type: "cosine",
          top_limit: 1    
        },
        return_params: {
          stream: {}
        }
      }) YIELD top
      RETURN top
      

      exec{
        algo(similarity).params({
          return_id_uuid: "id",
          ids: ["product1", "product3"], 
          node_schema_property: ["price", "weight", "width", "height"],
          type: "cosine",
          top_limit: 1        
        }).stream() as cs
        return cs
      } on hdc_sim_prop
      

      Result:

      _id1 _id2 similarity
      product1 product2 0.986529
      product3 product2 0.934217