The performance of multi-threaded applications depends on efficient scheduling of parallel tasks. Manually selecting schedulers is difficult because the best scheduler depends on the application, machine, and input. We present a framework that automatically selects the best scheduler based on empirical tuning results. We applied our framework to tune eleven applications parallelized using OpenMP, TBB, or the Galois system. Depending on the application and machine, we observed up to 4x performance improvement over the default scheduler. We were also able to prune the search space by an order of magnitude while still achieving performance within 16% of the best scheduler.
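
The core idea of empirical scheduler selection can be illustrated with a minimal sketch. This is not the framework's implementation: the function names `time_run` and `select_scheduler`, the `measure` callback, and the best-of-N timing policy are all assumptions made for illustration. The candidate names in the usage note are OpenMP loop schedule kinds, used here only as labels.

```python
import time
from typing import Callable, Iterable


def time_run(run: Callable[[], None]) -> float:
    """Wall-clock time of a single execution of `run`, in seconds."""
    start = time.perf_counter()
    run()
    return time.perf_counter() - start


def select_scheduler(
    names: Iterable[str],
    measure: Callable[[str], float],
    repeats: int = 3,
) -> str:
    """Return the candidate with the smallest best-of-`repeats` measurement.

    `measure(name)` is a hypothetical callback that runs the application
    under the named scheduler and returns the elapsed time in seconds.
    """
    best_name = None
    best_time = float("inf")
    for name in names:
        # Take the minimum over several runs to damp timing noise.
        elapsed = min(measure(name) for _ in range(repeats))
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    if best_name is None:
        raise ValueError("no candidate schedulers given")
    return best_name


# Illustrative use with made-up costs standing in for real measurements.
fake_costs = {"static": 2.0, "dynamic": 1.2, "guided": 1.5}
chosen = select_scheduler(fake_costs, lambda name: fake_costs[name])
```

In a real setting, `measure` would launch the application on the target machine and input, for example by wrapping the run in `time_run`. Pruning the search space, as described above, would correspond to passing fewer candidate names.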