Mining Patterns in Graphs with Multiple Weights

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaperJournal articleResearchpeer-review

  • Giulia Preti, University of Trento, Italy
  • Matteo Lissandrini, Aalborg University, Denmark
  • Davide Mottin
  • Yannis Velegrakis, University of Trento, Italy
Graph pattern mining aims at identifying structures that appear frequently in large graphs, under the assumption that frequency signifies importance.
In real life, there are many graphs with weights on nodes and/or edges.
For these graphs, it is fair that the importance (score) of a pattern is determined not only by the number of its appearances, but also by the weights on the nodes/edges of those appearances.
Scoring functions based on the weights do not generally satisfy the apriori property, which guarantees that the number of appearances of a pattern cannot be larger than the frequency of any of its sub-patterns, and hence allow faster pruning.
Therefore, existing approaches employ other, less efficient, pruning strategies.
The problem becomes even more challenging in the case of multiple weighting functions that assign different weights to the same nodes/edges.
In this work we propose a new family of scoring functions that respects the apriori property, and thus can rely on effective pruning strategies.
We provide efficient and effective techniques for mining patterns in multi-weight graphs, and we devise both an exact and an approximate solution.
In addition, we propose a distributed version of our approach, which distributes the appearances of the patterns to examine among multiple workers.
Extensive experiments on both real and synthetic datasets prove that the presence of edge weights and the choice of scoring function affect the patterns mined, and the quality of the results returned to the user.
Moreover, we show that, even when the performance of the exact algorithm degrades because of an increasing number of weighting functions, the approximate algorithm performs well and with fairly good quality.
Finally, the distributed algorithm proves to be the best choice for mining large and rich input graphs
Original languageEnglish
JournalDistributed and Parallel Databases
VolumeSpecial Issue
Number of pages39
ISSN0926-8782
DOIs
Publication statusPublished - 2019

    Research areas

  • Graph mining, Multi-weighted graphs, Personalized patterns, Weighted pattern mining

See relations at Aarhus University Citationformats

ID: 142989230