Today's state-of-the-art cluster supercomputers include commodity components such as multi-core CPUs and graphics processing units (GPUs). Together, these hardware devices provide unprecedented levels of performance in terms of raw GFLOPS and GFLOPS per unit cost. High-performance computing applications are always in search of lower execution times, greater system utilization, and better efficiency, so developers will need to leverage these disruptive technologies to take advantage of the full processing power of modern cluster computers. New application models and middleware systems are needed to ease the developer's task of writing programs that use this processing capability efficiently. Here, we present the implementation of a biomedical image analysis application that serves as a case study for the development of applications for modern heterogeneous supercomputers. We present detailed application-specific optimizations, which we generalize and combine with new programming models into a blueprint for future application development. Our techniques perform well on a modern heterogeneous GPU cluster that provides 10 TFLOPS of peak processing capability.