Subgraph counting aims to count the number of occurrences of a subgraph T (aka as a template) in a given graph G. The basic problem has found applications in diverse domains. The problem is known to be computationally challenging – the complexity grows both as a function of T and G. Recent applications have motivated solving such problems on massive networks with billions of vertices.
In this chapter, we study the subgraph counting problem from a parallel computing perspective. We discuss efficient parallel algorithms for approximately resolving subgraph counting problems by using the color-coding technique. We then present several system-level strategies to substantially improve the overall performance of the algorithm in massive subgraph counting problems. We propose: 1) a novel pipelined Adaptive-Group communication pattern to improve inter-node scalability, 2) a fine-grained pipeline design to effectively reduce the memory space of intermediate results, 3) partitioning neighbor lists of subgraph vertices to achieve better thread concurrency and workload balance. Experimentation on an Intel Xeon E5 cluster shows that our implementation achieves 5x speedup of performance compared to the state-of-the-art work while reduces the peak memory utilization by a factor of 2 on large templates of 12 to 15 vertices and input graphs of 2 to 5 billions of edges.
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
Tel.: +1 703 830 6300
Fax: +1 703 830 2300 firstname.lastname@example.org
(Corporate matters and books only) IOS Press c/o Accucoms US, Inc.
For North America Sales and Customer Service
West Point Commons
Lansdale PA 19446
Tel.: +1 866 855 8967
Fax: +1 215 660 5042 email@example.com