The potential of FPGAs for High-Performance Computing is increasingly recognized, but most work focuses on accelerating small, isolated kernels. We present a parallel FPGA implementation of a legacy algorithm, the seminal scheme for cumulus convection in large-scale models developed by Emanuel. Our design uses pipelining at both the arithmetic level and the logical stage level, keeping the entire algorithm on the FPGA. We assert that modern FPGAs have the resources to support large algorithms of this type. Through a practical and theoretical evaluation of our design, we show how such an FPGA implementation compares to GPU implementations and to multi-core approaches such as OpenMP.