As a guest user you are not logged in or recognized by your IP address. You have
access to the Front Matter, Abstracts, Author Index, Subject Index and the full
text of Open Access publications.
We consider the general problem of generating code for the automated selection of the expected best implementation variants for multiple subcomputations on a heterogeneous multicore system, where the program's control flow between the subcomputations is structured by sequencing and loops. A naive greedy approach as applied in previous works on multi-variant selection code generation would determine the locally best variant for each subcomputation instance but might miss globally better solutions. We present a formalization and a fast algorithm for the global variant selection problem for loop-based programs. We also show that loop unrolling can additionally improve performance, and prove an upper bound of the unroll factor which allows to keep the run-time space overhead for the variant-dispatch data structure low. We evaluate our method in case studies using an ARM big.LITTLE based system and a GPU based system where we consider optimization for both energy and performance.
This website uses cookies
We use cookies to provide you with the best possible experience. They also allow us to analyze user behavior in order to constantly improve the website for you. Info about the privacy policy of IOS Press.
This website uses cookies
We use cookies to provide you with the best possible experience. They also allow us to analyze user behavior in order to constantly improve the website for you. Info about the privacy policy of IOS Press.