Exploiting network data (i.e., graphs) is a rather particular case of data mining. The size and relevance of network domains justify research on graph mining, but also bring forth severe complications. Computational aspects like scalability and parallelism have to be reevaluated, as well as certain aspects of the data mining process. One of those is the methodology used to evaluate graph mining methods, particularly when processing large graphs. In this paper we focus on the evaluation of a graph mining task known as Link Prediction. First, we explore the solutions available in traditional data mining for that purpose and discuss which methods are most appropriate. Once those are identified, we discuss their capabilities and limitations for producing a faithful and useful evaluation. Finally, we introduce a novel modification to a traditional evaluation methodology with the goal of adapting it to the problem of Link Prediction on large graphs.