Change Password

Please enter the password.
Please enter the password. Between 8-64 characters. Not identical to your email address. Contain at least 3 of: uppercase, lowercase, numbers, and special characters.
Please enter the password.
Submit

Change Nickname

Current Nickname:
Submit

Apply New License

License Detail

Please complete this required field.

  • Ultipa Graph V4

Standalone

Please complete this required field.

Please complete this required field.

The MAC address of the server you want to deploy.

Please complete this required field.

Please complete this required field.

Cancel
Apply
ID
Product
Status
Cores
Applied Validity Period(days)
Effective Date
Excpired Date
Mac Address
Apply Comment
Review Comment
Close
Profile
  • Full Name:
  • Phone:
  • Company:
  • Company Email:
  • Country:
  • Language:
Change Password
Apply

You have no license application record.

Apply
Certificate Issued at Valid until Serial No. File
Serial No. Valid until File

Not having one? Apply now! >>>

Product Created On ID Amount (USD) Invoice
Product Created On ID Amount (USD) Invoice

No Invoice

v5.0
Search
    English
    v5.0

      TextRank

      HDC

      Overview

      TextRank, derived from PageRank, is a graph-based ranking model for text processing. It can be applied to various natural language processing tasks such as keyword extraction, keyphrase extraction, and text summarization.

      Concepts

      Text as a Graph

      To apply the TextRank algorithm, the text must first be represented as a graph .The structure of the graph depends on the specific application:

      • Nodes: Text units that best fit the task, such as words, collocations, or sentences, are added as nodes in the graph.
      • Edges: Relationships between text units, such as semantic similarity, co-occurrence, or contextual overlap, are used to connect nodes with edges.

      Sample graph build for keyphrase extraction: nodes are selected lexical units from the text, and edges are established based on co-occurrence within a defined window of words (Source: Original paper)

      TextRank Model

      The ranks of all text units are computed recursively using a "recommendation" mechanism, similar to the PageRank algorithm. However, TextRank incorporates edge weights, using a modified formula to integrate these weights effectively:

      where,

      • Out(v) is the set of nodes that node v points to;
      • wvu is the edge weight between nodes v and u;
      • d is the damping factor.

      Considerations

      • The rank of isolated text unit will stay the same as the value of (1 - d).
      • Self-loop is regarded as a successor and a predecessor, a node would pass some rank to itself through self-loop. If a network has many self-loops, it will take more iterations to converge.

      Example Graph

      To create this graph:

      // Runs each row separately in order in an empty graphset
      create().edge_property(@default, "weight", int32)
      insert().into(@default).nodes([{_id:"A"}, {_id:"B"}, {_id:"C"}, {_id:"D"}, {_id:"E"}, {_id:"F"}, {_id:"G"}])
      insert().into(@default).edges([{_from:"A", _to:"E", weight:3}, {_from:"B", _to:"A", weight:3}, {_from:"B", _to:"E", weight:2}, {_from:"C", _to:"A", weight:1}, {_from:"C", _to:"D", weight:4}, {_from:"D", _to:"E", weight:5}, {_from:"E", _to:"G", weight:2}, {_from:"F", _to:"B", weight:1}, {_from:"F", _to:"G", weight:3}])
      

      Creating HDC Graph

      To load the entire graph to the HDC server hdc-server-1 as hdc_textrank:

      CALL hdc.graph.create("hdc-server-1", "hdc_textrank", {
        nodes: {"*": ["*"]},
        edges: {"*": ["*"]},
        direction: "undirected",
        load_id: true,
        update: "static",
        query: "query",
        default: false
      })
      

      hdc.graph.create("hdc_textrank", {
        nodes: {"*": ["*"]},
        edges: {"*": ["*"]},
        direction: "undirected",
        load_id: true,
        update: "static",
        query: "query",
        default: false
      }).to("hdc-server-1")
      

      Parameters

      Algorithm name: text_rank

      Name
      Type
      Spec
      Default
      Optional
      Description
      init_value Float >0 0.2 Yes The initial rank assigned to all nodes.
      loop_num Integer ≥1 5 Yes The maximum number of iteration rounds. The algorithm will terminate after completing all rounds.
      damping Float (0,1) 0.8 Yes The damping factor.
      max_change Float ≥0 0 Yes The algorithm terminates when the changes in all ranks between iterations are less than the specified max_change, indicating that the result is stable. Sets to 0 to disable this criterion.
      edge_schema_property []"<@schema.?><property>" / / No Numeric edge properties as weights, summing values across the specified properties; edges without the specified properties are ignored.
      return_id_uuid String uuid, id, both uuid Yes Includes _uuid, _id, or both values to represent nodes in the results.
      limit Integer ≥-1 -1 Yes Limits the number of results returned; -1 includes all results.
      order String asc, desc / Yes Sorts the results by rank.

      File Writeback

      CALL algo.text_rank.write("hdc_textrank", {
        params: {
          return_id_uuid: "id",
          init_value: 1,
          loop_num: 50,
          damping: 0.8,
          edge_schema_property: "weight",
          order: 'desc'
        },
        return_params: {
          file: {
            filename: "textrank"
          }
        }
      })
      

      algo(text_rank).params({
        project: "hdc_textrank",
        return_id_uuid: "id",
        init_value: 1,
        loop_num: 50,
        damping: 0.8,
        edge_schema_property: "weight",
        order: 'desc'
      }).write({
        file: {
          filename: "textrank"
        }
      })
      

      Result:

      _id,text_rank
      G,0.973568
      E,0.81696
      A,0.3472
      D,0.328
      B,0.24
      F,0.2
      C,0.2
      

      DB Writeback

      Writes the text_rank values from the results to the specified node property. The property type is float.

      CALL algo.text_rank.write("hdc_textrank", {
        params: {
          loop_num: 50,
          edge_schema_property: "@default.weight"
        },
        return_params: {
          db: {
            property: "rank"
          }
        }
      })
      

      algo(text_rank).params({
        project: "hdc_textrank",
        loop_num: 50,
        edge_schema_property: "@default.weight"
      }).write({
        db:{ 
          property: 'rank'
        }
      })
      

      Full Return

      CALL algo.text_rank("hdc_textrank", {
        params: {
          return_id_uuid: "id",    
          init_value: 1,
          loop_num: 50,
          damping: 0.8,
          edge_schema_property: "weight",
          order: 'desc',
          limit: 5
        },
        return_params: {}
      }) YIELD TR
      RETURN TR
      

      exec{
        algo(text_rank).params({
          return_id_uuid: "id",    
          init_value: 1,
          loop_num: 50,
          damping: 0.8,
          edge_schema_property: "weight",
          order: 'desc',
          limit: 5
        }) as TR
        return TR
      } on hdc_textrank
      

      Result:

      _id text_rank
      G 0.973568
      E 0.81696
      A 0.3472
      D 0.328
      B 0.24

      Stream Return

      CALL algo.text_rank("hdc_textrank", {
        params: {
          return_id_uuid: "id",
          loop_num: 50,
          damping: 0.8,
          edge_schema_property: "weight",
          order: 'desc',
          limit: 5
        },
        return_params: {
        	stream: {}
        }
      }) YIELD TR
      RETURN TR
      

      exec{
        algo(text_rank).params({
          return_id_uuid: "id",
          loop_num: 50,
          damping: 0.8,
          edge_schema_property: "weight",
          order: 'desc',
          limit: 5
        }).stream() as TR
        return TR
      } on hdc_textrank
      

      Result:

      _id text_rank
      G 0.973568
      E 0.81696
      A 0.3472
      D 0.328
      B 0.24
      Please complete the following information to download this book
      *
      公司名称不能为空
      *
      公司邮箱必须填写
      *
      你的名字必须填写
      *
      你的电话必须填写