Traditional clustering methods usually rely on pairwise similarity between data points. In many cases, however, especially for high-dimensional data in computer vision, similarity must be expressed over more than two data points at once. Hypergraph clustering is well suited to this setting: higher-order similarities on subsets of the data, represented by hyperedges, capture the similarity among more than two data points. Hypergraph clustering usually consists of two stages, hypergraph construction and hypergraph partitioning. Two important questions in hypergraph construction are how to generate the hyperedges and how many hyperedges are needed to represent the original data. Recently, Pulak Purkait et al. proposed a method for generating large pure hyperedges, which has been shown to be more effective than traditional methods for computer vision tasks. However, that method requires the number of hyperedges to be specified in advance and generates hyperedges by random sampling, which may lead to suboptimal clustering results. This work therefore proposes a novel sampling method, called greedy neighborhood search, which generates large pure hyperedges based on Shared Reverse k Nearest Neighbors (SRNN) and learns the number of hyperedges at the same time. Experiments show the benefits of applying the proposed method to high-dimensional data.
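As a rough illustration of the neighborhood notion the method builds on, the sketch below computes reverse k nearest neighbors with NumPy and counts how many reverse neighbors two points share. The reverse-neighbor definition (point j is a reverse neighbor of i when i is among the k nearest neighbors of j) is the standard one; using the size of the shared set as a similarity score is an assumption made here for illustration, as are the function names `reverse_knn` and `shared_rnn_count` and the toy data. It is not the greedy neighborhood search procedure itself, which the abstract does not specify.

```python
import numpy as np


def reverse_knn(points, k):
    """Return a list where entry i is the set of indices j whose k nearest
    neighbours (Euclidean distance) include point i."""
    n = len(points)
    # Pairwise Euclidean distances between all points.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # A point is never counted as its own neighbour.
    np.fill_diagonal(dist, np.inf)
    # Row j holds the indices of the k nearest neighbours of point j.
    knn = np.argsort(dist, axis=1)[:, :k]
    rnn = [set() for _ in range(n)]
    for j in range(n):
        for i in knn[j]:
            rnn[int(i)].add(j)
    return rnn


def shared_rnn_count(rnn, a, b):
    """Number of reverse neighbours that points a and b have in common
    (illustrative similarity score, assumed for this sketch)."""
    return len(rnn[a] & rnn[b])


# Toy data: two well-separated groups of three points each.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])
rnn = reverse_knn(points, k=2)
same_group = shared_rnn_count(rnn, 0, 1)
different_groups = shared_rnn_count(rnn, 0, 3)
```

On this toy data, points from the same group share a reverse neighbor while points from different groups share none, which is the kind of signal one would want when grouping points into a pure hyperedge.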