Locality Optimization on a NUMA Architecture for Hybrid LU Factorization

Rémy, Adrien; Baboulin, Marc; Sosonkina, Masha; Rozoy, Brigitte

doi:10.3233/978-1-61499-381-0-153

Authors

Adrien Rémy, Marc Baboulin, Masha Sosonkina, Brigitte Rozoy

Pages

153 - 162

DOI

10.3233/978-1-61499-381-0-153

Series

Advances in Parallel Computing

Ebook

Volume 25: Parallel Computing: Accelerating Computational Science and Engineering (CSE)

Abstract

We study the impact of non-uniform memory accesses (NUMA) on the solution of dense general linear systems using an LU factorization algorithm. In particular, we illustrate how an appropriate placement of the threads and memory on a NUMA architecture can improve the performance of the panel factorization and consequently accelerate the global LU factorization. We apply these placement strategies and present performance results for a hybrid multicore/GPU LU algorithm as it is implemented in the public domain library MAGMA.
