On a “Sandy Bridge E5-2670” CPU, we compare several performance metrics (speed, energy, cache traffic) of two multi-threaded Sparse Matrix-Vector Multiplication implementations: Intel MKL's CSR and librsb's RSB. Our main finding is that the benefit of RSB over MKL's CSR varies substantially with the numerical type (12–40%). We also confirm the results of other studies: energy savings are strongly correlated with fast code, and minimizing memory traffic is the primary performance factor.