Resource type
Thesis type
(Thesis) M.Sc.
Date created
2023-12-01
Authors/Contributors
Author (aut): Che, Joanna
Abstract
Graph mining workloads search for subgraphs of interest in large graphs. This often involves finding subgraphs with containment constraints such that the subgraph does not contain a smaller subgraph (minimality) or the subgraph is not part of a larger subgraph (maximality). Existing graph mining systems employ efficient task-parallel strategies to quickly explore subgraphs of interest, however they remain oblivious to containment constraints. Hence, graph mining systems require expensive constraint checking on every explored match as well as redundant explorations that limit their scalability. In this work, we develop a novel exploration model for mining subgraphs with containment constraints. First, we identify the impact of constraints using different types of dependencies (successor, predecessor, and lateral) between the subgraphs of interest. Then, we develop four key task management strategies to exploit these dependencies: (a) task fusion to merge correlated tasks for increasing cache reuse; (b) task promotion to allow continuous explorations from available subgraphs and skip re-exploring subgraphs from scratch; (c) task cancellations to avoid unnecessary constraint checking and prioritizes faster constraint validations; and (d) task skipping to safely skip certain exploration and validation tasks. Finally, we extensively evaluate our model on real-world graphs and applications. Our task management strategies efficiently compute graph mining queries with containment constraints and our exploration model scales to very large graph mining workloads that could not be handled with prior approaches.
Document
Extent
48 pages.
Identifier
etd22864
Copyright statement
Copyright is held by the author(s).
Supervisor or Senior Supervisor
Thesis advisor (ths): Vora, Keval
Language
English
Member of collection
Download file | Size |
---|---|
etd22864.pdf | 1.13 MB |