

The general-purpose graphics processing unit (GPGPU) is a popular accelerator for general applications such as scientific computing, because such applications are massively parallel and can exploit the significant parallel computing power of GPUs. However, distributing the workload among the large number of cores, i.e., choosing the execution configuration of a GPGPU kernel, is still a manual trial-and-error process. Programmers manually try out a few configurations and may settle for a sub-optimal one, leading to poor performance and/or high power consumption. The state-of-the-art methods for addressing this issue rely mainly on heavy profiling of computation kernels. This paper presents an auto-tuning approach for GPGPU applications based on performance and power models. First, a model-based analytic approach for estimating the performance and power consumption of kernels is proposed. Second, an auto-tuning framework is proposed for automatically obtaining a near-optimal configuration for a kernel computation. In this work, we formulate the tuning problem as a constrained optimization problem and solve it with optimization algorithms. Experimental results demonstrate the fidelity of the proposed performance and energy models and justify the use of model-based auto-tuning for minimizing tuning overhead. Furthermore, the proposed method outperforms previous methods in both the efficiency of the tuning procedure and the quality of the resulting configurations.
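One illustrative way to state such a constrained optimization is sketched below; the symbols are introduced here for exposition only and are not the paper's notation. The search ranges over candidate execution configurations $c$ (e.g., grid and thread-block dimensions):
\[
\min_{c \in \mathcal{C}} \; f\!\left(\widehat{T}(c),\, \widehat{P}(c)\right)
\quad \text{subject to} \quad
g_i(c) \le b_i, \quad i = 1, \dots, m,
\]
where $\widehat{T}(c)$ and $\widehat{P}(c)$ denote the model-predicted execution time and power for configuration $c$, $f$ is the tuning objective (for instance, time, energy $\widehat{T}(c)\cdot\widehat{P}(c)$, or a weighted combination), and the constraints $g_i(c) \le b_i$ capture hardware limits such as the maximum threads per block, registers per multiprocessor, and shared memory per block.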