Business Email Compromise (BEC) has recently become a serious problem. Security companies and other organizations warn about BEC and urge us to defend against it. Although many spam filters exist, there are very few BEC filters, so recipients have to detect BEC themselves. One characteristic of BEC is that the wording and style of a fraudulent email differ from the purported sender's usual writing. Software that detects this difference can help us defend against BEC. Based on this idea, we propose a method to identify the author of an email using machine learning algorithms. In this approach, we build identification models from emails received in the past. We define a target person in advance and use machine learning algorithms to build models that determine whether an email was sent by this person or not. We convert each email into a feature vector consisting of the similarity between the subject and the body, the distribution of parts of speech, and the occurrence of terms in the beginning of the body. We build models from these feature vectors using four machine learning algorithms: k-nearest neighbors (KNN), support vector machine (SVM), naive Bayes classifier (NBC), and decision tree. We then use these models to determine whether the target person wrote a new email. We evaluated these approaches using an open dataset and tools. The best accuracy is about 0.84 and the best Kappa statistic is 0.68, which indicates good agreement. However, a simple method achieves a better Kappa statistic, so we could not show an advantage for our approach. Overfitting is one of the reasons why our approach did not outperform an existing approach. We have to address this weakness by drawing on the literature and on other approaches. Moreover, we have to evaluate the revised approach on a larger dataset. These tasks remain as future work.
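To make the relation between the two reported metrics concrete, the following minimal sketch computes accuracy and Cohen's Kappa from a binary confusion matrix (target person versus anyone else). The function name `accuracy_and_kappa` and the counts in the usage line are hypothetical and chosen for illustration only; they are not results from the evaluation described above.

```python
def accuracy_and_kappa(tp, fn, fp, tn):
    """Return (accuracy, Cohen's Kappa) for a 2x2 confusion matrix.

    tp: target-person emails classified as the target person
    fn: target-person emails classified as someone else
    fp: other emails classified as the target person
    tn: other emails classified as someone else
    """
    total = tp + fn + fp + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement between predictions and true labels (accuracy).
    observed = (tp + tn) / total

    # Agreement expected by chance, from the row and column marginals.
    p_target = ((tp + fn) / total) * ((tp + fp) / total)
    p_other = ((fp + tn) / total) * ((fn + tn) / total)
    expected = p_target + p_other

    if expected == 1.0:
        # Degenerate case: every email falls in one class for both raters.
        return observed, 0.0

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical, balanced counts (not taken from the paper).
accuracy, kappa = accuracy_and_kappa(tp=42, fn=8, fp=8, tn=42)
print(f"accuracy={accuracy:.2f} kappa={kappa:.2f}")
```

Kappa discounts the agreement that would occur by chance, so on an imbalanced dataset a classifier can reach high accuracy while its Kappa stays low. This is one reason to report both values when comparing models.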