With the wide adoption of Hadoop-based big data platforms in industry, monitoring and maintaining Hadoop clusters has become increasingly critical, which in turn creates high demand for automatic or semi-automatic health checking and diagnostics. Anomaly detection plays an essential role in monitoring Hadoop clusters and provides the basis for automatic alert generation. Most existing anomaly detection methods for Hadoop clusters are rule-based and rely on characterizing anomalies with domain knowledge and an understanding of the specific target system, which poses barriers to practical application on complex business platforms. In this paper, we leverage statistical analysis to discover critical components in execution data collected through the Hadoop JobTracker. A two-stage anomaly detection method is proposed that integrates PCA and DBSCAN. This method requires no prior knowledge of the target nodes and jobs. Experiments were conducted to detect buggy jobs in real execution data collected from three Hadoop clusters operated by one of the largest e-commerce companies in the world. The results show that our method effectively detects anomalies without the features of the anomalies being specified in advance.
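The two-stage pipeline described above, PCA for dimensionality reduction followed by DBSCAN for density-based outlier labelling, can be sketched as below. This is an illustrative sketch only: the actual features extracted from JobTracker execution data and the tuned hyperparameters (`eps`, `min_samples`, number of retained components) are not given in the abstract, so the synthetic "job metrics" and all parameter values here are assumptions.

```python
# Sketch of a two-stage PCA + DBSCAN anomaly detector.
# Assumptions: synthetic stand-in data for JobTracker job metrics,
# and illustrative hyperparameters; the paper's real settings differ.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic "job execution metrics": 200 normal jobs (rows 0-199)
# plus 5 injected buggy jobs (rows 200-204), each with one metric
# far outside the normal range.
normal = rng.normal(size=(200, 10))
outliers = np.zeros((5, 10))
for i in range(5):
    outliers[i, i] = 12.0  # extreme value on one metric
X = np.vstack([normal, outliers])

# Stage 1: standardize, then keep the principal components that
# explain 95% of the variance, filtering noise dimensions.
X_scaled = StandardScaler().fit_transform(X)
X_reduced = PCA(n_components=0.95).fit_transform(X_scaled)

# Stage 2: DBSCAN clusters the reduced data; points labelled -1 lie
# in low-density regions and are flagged as anomalies.
labels = DBSCAN(eps=3.0, min_samples=5).fit_predict(X_reduced)
anomaly_idx = np.where(labels == -1)[0]
print("flagged jobs:", anomaly_idx)
```

Because DBSCAN labels low-density points as noise rather than assigning them to a cluster, no anomaly signature needs to be specified in advance; normal jobs form dense clusters and buggy jobs fall out as noise, matching the premise of the method.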
IOS Press, Inc.