General-purpose computing on graphics processing units (GPGPU) is fast becoming a common feature of high-performance computing centers. In this paper we discuss implementation issues related to dense linear algebra computations on GPUs, such as the GEneral Matrix-Matrix product (GEMM), as well as other kernels that share the same computational pattern, such as the matrix formulation of the All-Pairs Shortest-Path problem. Our CUDA implementation shows a significant performance improvement over the vendor's software on NVIDIA processing units. We review the optimization techniques that can be employed to implement such operations and outline further development work in related application domains.