Modeling and mining multi-aspect graphs with scalable streaming tensor decomposition

Graphs emerge in almost every real-world application domain, ranging from online social networks all the way to health data and movie viewership patterns. Typically, such real-world graphs are big and dynamic, in the sense that they evolve over time.

Furthermore, graphs usually contain multi-aspect information i.e., in a social network, we can have the “means of communication” between nodes, such as who messages whom, who calls whom, who comments on whose timeline, and so on.

How can we model and mine useful patterns, such as communities of nodes in that graph, from such multi-aspect graphs? How can we identify dynamic patterns in those graphs, and how can we deal with streaming data, when the volume of data to be processed is very large?

In order to answer these questions, In Dr. Ekta Gujral’s thesis, she proposed novel tensor-based methods for mining static and dynamic multi-aspect graphs.

In general, a tensor is a higher-order generalization of a matrix that can represent high-dimensional multi-aspect data such as time-evolving networks, collaboration networks, and Spatio-temporal data like Electroencephalography (EEG) brain measurements.

Dr. Gujral’s thesis is organized in two synergistic thrusts:

  • First, she focused on static multi-aspect graphs, where the goal is to identify coherent communities and patterns between nodes by leveraging the tensor structure in the data.
  • Second, as graphs evolve dynamically, she focused on handling such streaming updates in the data without having to re-compute the decomposition, but incrementally updating the existing results.

Dr. Gujral’s thesis focused on static multi-aspect data, in which, static network information is available.

She detailed her proposed algorithms spanning the topics of community detection in semi-supervised matrix-tensor coupling settings and structurally dynamic multi-aspect graph summarization through tensor analysis.

Dr. Gujral’s proposed method SMACD algorithm based on the CP decomposition is the best to incorporate multi-aspect graph information and semi-supervision while being able to discover overlapping and non-overlapping communities in social networks.

In addition to the above, her thesis posed and studied the niche detection problem, which imposes an explainable lens on the classical problem of co-clustering interactions.

Next, Dr. Gujral expands the scope to incremental multi-aspect data, in which she leverages network information over time. In a wide array of modern real-world applications, the data is far from static.

As the volume and velocity of data grow, the need for time and space-efficient online tensor decomposition is imperative. Unfortunately, this is an NP-Hard problem, and especially in the streaming case, a lot of challenges make it more complex and non-trivial.

For more insights and details on online tensor decomposition check out the full thesis content on Escholarship.org.