In this paper, we propose a data placement strategy that addresses workload imbalance across DataNodes. Based on the computing capability of each node in a heterogeneous Hadoop cluster, the proposed strategy balances the data stored on each DataNode so that the time spent transferring data between nodes can be substantially reduced. As a result, overall Hadoop performance improves. Experimental results demonstrate that the proposed data placement strategy decreases execution time and thus improves Hadoop performance in a heterogeneous cluster.
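
The core idea, placing data according to each node's computing capability, can be illustrated with a minimal sketch. The abstract does not give the exact allocation rule, so the code below assumes a simple proportional split with largest-remainder rounding. The function name `allocate_blocks`, the node names, and the capability values are hypothetical and serve only as illustration.

```python
def allocate_blocks(total_blocks, capabilities):
    """Split total_blocks across nodes in proportion to their capability.

    Assumption: capability is a single non-negative number per node
    (for example, a measured relative processing speed). This is an
    illustrative rule, not necessarily the one used in the paper.
    """
    total_cap = sum(capabilities.values())
    if total_cap <= 0:
        raise ValueError("capabilities must sum to a positive value")

    # Ideal fractional share of blocks for each node.
    exact = {node: total_blocks * cap / total_cap
             for node, cap in capabilities.items()}

    # Start from the integer part of each share.
    alloc = {node: int(share) for node, share in exact.items()}

    # Give the remaining blocks to the nodes with the largest
    # fractional remainders so the total is preserved exactly.
    leftover = total_blocks - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - alloc[n], reverse=True)
    for node in by_remainder[:leftover]:
        alloc[node] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical cluster: "fast" is three times as capable as "slow".
    print(allocate_blocks(12, {"fast": 3, "medium": 2, "slow": 1}))
```

With this rule, a node that is three times as capable as another receives roughly three times as many blocks, so faster nodes are less likely to run out of local data and fetch blocks from slower nodes during execution.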