General-purpose computing on graphics processing units (GPGPU) is fast becoming a common feature of high-performance computing centers. In this paper we discuss implementation issues related to dense linear algebra computations on GPUs, such as the GEneral Matrix-Matrix product (GEMM), as well as other kernels that share the same computational pattern, such as the matrix formulation of the All-Pairs Shortest-Path problem. Our CUDA implementation shows a significant performance improvement over the vendor's software on NVIDIA processing units. We review the optimization techniques that can be employed to implement such operations and outline further development work in related application domains.