Reducing the number of alerts and anomalies has been the focus of several studies, but an automated anomaly detection using log files is still an ongoing challenge. One of the pertinent challenges in the detection of anomalies using log files is dealing with ‘unlabelled’ data. In the existing approaches, there is a lack of anomalous examples and that log anomalies can have many different patterns. One solution is to label the data manually, but this can be a tedious task as the data size could be very large and log files are not easily understandable. In this paper, we have presented an automated anomaly detection model that combines supervised and unsupervised machine learning with domain knowledge. Our method reduces the number of alerts by accurately predicting anomalous log events based on domain expertise, which is used to create automated rules that allow generating a labelled dataset from unlabelled log records, which are unstructured and present in many different formats. This labelled dataset is then used to train a classification model that will help predict anomalous log events. Our results show that we can accurately predict anomalous and non-anomalous events with an average accuracy of 98%. Our approach offers a practical solution for systems where logs are collected without any labelling, making it difficult to create an accurate model to identify anomalous log records. The methodology presented is very fast and efficient, which can provide real-time anomaly detection for time critical environments.
IOS Press, Inc.
6751 Tepper Drive
Clifton, VA 20124
Tel.: +1 703 830 6300
Fax: +1 703 830 2300 firstname.lastname@example.org
(Corporate matters and books only) IOS Press c/o Accucoms US, Inc.
For North America Sales and Customer Service
West Point Commons
Lansdale PA 19446
Tel.: +1 866 855 8967
Fax: +1 215 660 5042 email@example.com