In 2011 many computer users were exploring the opportunities and benefits of the massive parallelism offered by heterogeneous computing. In 2000 the Khronos Group, a not-for-profit industry consortium, was founded to create open standard APIs for parallel computing, graphics, and dynamic media. One of these standards is OpenCL, an open system for programming heterogeneous computers with components made by multiple manufacturers. This publication explains how heterogeneous computers work and how to program them using OpenCL. It also describes how to combine OpenCL with OpenGL to display graphical effects in real time.
Chapter 1 briefly describes two older and highly successful de facto standards for parallel programming: MPI and OpenMP. Together, MPI, OpenMP, and OpenCL cover the programming of all major parallel architectures: clusters, shared-memory computers, and the newest heterogeneous computers. Chapter 2, the technical core of the book, deals with OpenCL fundamentals: programming, hardware, and the interaction between them. Chapter 3 adds important information about advanced issues such as double- versus single-precision arithmetic, efficiency, memory use, and debugging. Chapters 2 and 3 contain several code examples and one case study on genetic algorithms. These examples are related to linear algebra operations, which are very common in scientific, industrial, and business applications. Most of the book's examples can be found on the enclosed CD, which also contains basic projects for Visual Studio, MinGW, and GCC. This supplementary material will help the reader get a quick start on OpenCL projects.
This book contains the essential information required for designing correct and efficient OpenCL programs. Some details have been omitted but can be found in the provided references. The authors assume that readers are familiar with the basic concepts of parallel computation, have some programming experience with C or C++, and have a fundamental understanding of computer architecture. All terms, definitions, and function signatures in the book have been copied from the official API documents available on the website of the creators of the OpenCL standard.
The book was written in 2011, when OpenCL was in transition from its infancy to maturity as a practical programming tool for solving real-life problems in science and engineering. Earlier, the Khronos Group had successfully defined the OpenCL specifications, and several companies had developed stable OpenCL implementations ready for learning and testing. A significant contribution to programming heterogeneous computers was made by NVIDIA, which created CUDA, one of the first working systems for programming massively parallel computers. OpenCL has borrowed several key concepts from CUDA. At this time (fall 2011), one can install OpenCL on a heterogeneous computer and perform meaningful computing experiments. Since OpenCL is relatively new, there are not many experienced users or sources of practical information. Some helpful publications about OpenCL can be found on the Web, but there is still a shortage of complete descriptions of the system suitable for students and potential users from the scientific and engineering application communities.
Chapter 1 provides short but realistic examples of code using MPI and OpenMP so that readers can compare these two mature and very successful systems with the fledgling OpenCL. MPI, used for programming clusters, and OpenMP, used for shared-memory computers, have achieved remarkable worldwide success for several reasons. Both were designed by groups of parallel computing specialists who thoroughly understood scientific and engineering applications and software development tools. Both MPI and OpenMP are very compact and easy to learn. Our experience indicates that scientists or students from disciplines other than computer science can be taught to use MPI and OpenMP in a few hours. We hope that OpenCL will benefit from this experience and achieve similar success in the near future.
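As a rough illustration of the compactness attributed to OpenMP above, the following minimal sketch parallelizes a dot product, a basic linear algebra operation, with a single directive. It is not taken from the book; the function name dot_product is hypothetical and was chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch only (hypothetical function, not from the book).
 * Computes the dot product of two vectors of length n.
 * The pragma asks an OpenMP-enabled compiler to split the loop across
 * threads and to combine the per-thread partial sums into `sum`.
 * A compiler without OpenMP support ignores the pragma and runs the
 * loop serially, producing the same result. */
double dot_product(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    long i; /* signed index, accepted by older OpenMP versions too */

    /* Both pointers must be valid unless the vectors are empty. */
    assert(n == 0 || (x != NULL && y != NULL));

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < (long)n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

With GCC, OpenMP is enabled by the -fopenmp flag; without it the same source still compiles as ordinary serial C, which is part of what makes the directive-based approach easy to adopt incrementally.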
To paraphrase the wisdom of Albert Einstein, we need to make OpenCL as simple as possible, but not simpler. The reader should keep in mind that OpenCL will keep evolving and that pioneer users always pay an additional price, in the form of initially longer program development time and suboptimal performance, before they gain experience. The goal of achieving simplicity in OpenCL programming requires an additional comment. By supporting heterogeneous computing, OpenCL lets us select diverse parallel processing devices manufactured by different vendors in order to achieve optimal or near-optimal performance. We can select multi-core CPUs, GPUs, FPGAs, and other parallel processing devices to fit the problem we want to solve. This flexibility is welcomed by many users of HPC technology, but it has a price.
Programming heterogeneous computers is somewhat more complicated than writing conventional MPI and OpenMP programs. We hope this gap will disappear as OpenCL matures and comes into universal use for solving large scientific and engineering problems.
IOS Press, Inc.