Parallelisation of Block-Recursive Matrix Multiplication in Prefix Computations

Bader, Michael; Hanigk, Sebastian; Huckle, Thomas

Abstract

We present a performance and scalability study on the parallelisation of a series of matrix multiplications within a quantum control problem. As a time-critical sub-task, all matrix products A₁A_k for matrices A₁,…,A_m have to be computed (prefix problem). The parallelisation can either be coarse-grain, where parallel prefix computation is used to concurrently execute the individual matrix multiplications; however, there is also the obvious fine-grain parallelisation approach to parallelise the subsequent matrix multiplications, and mixtures of these two principal approaches are possible. We compare two parallel multiplication algorithms that are based on block-structuring – SRUMMA and a modified, block-recursive approach based on Peano space-filling curves – and study their efficiency on a compute cluster with 128 processors. The Peano approach proves to be advantageous especially for moderately sized matrices. Hence, we compare the Peano algorithm's performance for coarse-grain, fine-grain, and hybrid parallelisation of the prefix problem. The Peano algorithm proves to be scalable enough to allow a purely fine-grain parallelisation of the prefix problem.

This website uses cookies

This website uses cookies