Subgraph Search for Dynamic Graphs
Subgraph search is the problem of searching a data graph for occurrences of another graph, typically referred to as the query or pattern graph. This thesis studies a specific class of subgraph search, namely subgraph isomorphism for dynamic graphs, i.e., graphs that evolve over time. Subgraph isomorphism is a well-studied NP-complete problem in computer science. It requires finding a bijective mapping between the vertices of the query graph and a subset of the vertices of the data graph such that if two vertices are neighbors in the query graph, their mapped counterparts are neighbors in the data graph as well. Our research on dynamic graphs is motivated by large-scale graph data sources such as social media and cyber-security, where applications continuously produce prodigious amounts of data. This thesis makes three major contributions. First, we propose a new subgraph isomorphism algorithm for dynamic graphs and a novel data structure, the Subgraph Join-Tree (SJ-Tree), to support the algorithm. Second, we demonstrate how the statistics of the graph stream can be leveraged to produce the best-performing SJ-Tree for a given query graph. We propose a metric for estimating the selectivity of a graph query and demonstrate its use in reasoning about the relative performance of different query execution strategies. Our observations are supported by experiments on multiple real-world data sources drawn from online news, social media and network traffic. These experiments demonstrate speedups of 10 to 100 times over existing approaches. Third, we propose a new algorithm and supporting data structures to implement the aforementioned dynamic graph search algorithm on a distributed system. As a secondary contribution, we demonstrate how the novel ideas from the graph search paradigm can be applied to discover patterns in dynamic graphs. The thesis concludes with a presentation of real-world use cases from three different application domains.
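To make the matching condition above concrete, the following is a minimal brute-force sketch in Python. It is not the SJ-Tree algorithm proposed in the thesis and it ignores the dynamic setting entirely; it only enumerates every injective vertex mapping and keeps those that send each query edge to a data edge. The function name `find_matches`, the edge-list representation, the undirected treatment of edges, and the toy graphs are all assumptions made for illustration.

```python
from itertools import permutations


def find_matches(query_edges, data_edges):
    """Return every injective vertex mapping that sends each query edge to a data edge.

    Illustrative brute force only (exponential time); graphs are assumed
    undirected and are given as lists of vertex pairs.
    """
    # Store each data edge as a frozenset so that (u, v) and (v, u) compare equal.
    data = {frozenset(e) for e in data_edges}
    # Vertices are inferred from the edge lists, so isolated vertices are not represented.
    query_vertices = sorted({v for e in query_edges for v in e})
    data_vertices = sorted({v for e in data_edges for v in e})

    matches = []
    # permutations() yields ordered selections of distinct data vertices,
    # so every candidate mapping is injective by construction.
    for image in permutations(data_vertices, len(query_vertices)):
        mapping = dict(zip(query_vertices, image))
        # Keep the mapping only if neighbors in the query stay neighbors in the data graph.
        if all(frozenset((mapping[u], mapping[v])) in data for u, v in query_edges):
            matches.append(mapping)
    return matches


# Hypothetical toy input: a triangle query against a triangle with one pendant vertex.
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
data_graph = [(1, 2), (2, 3), (1, 3), (3, 4)]
matches = find_matches(triangle, data_graph)
```

On this toy input every match lands on the vertices 1, 2 and 3, and the same triangle is reported once per symmetry of the query. The cost of this enumeration grows factorially with the query size, which is why incremental approaches are of interest when the data graph changes continuously.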