Hypergraph Clustering by Generating Large Pure Hyperedges Using Greedy Neighborhood Search

Ran, Xingcheng; Lu, Yonggang; Wang, Xiangwen; Lu, Zhenyu

doi:10.3233/FAIA190176

Abstract

Traditional clustering methods usually rely on pairwise similarities between data points. In many cases, however, especially for high-dimensional data in computer vision, similarity must be expressed over more than two data points at once. Hypergraph clustering is well suited to this setting: high-order similarities on data subsets, represented by hyperedges, capture the similarity among more than two data points. Hypergraph clustering usually consists of two steps, hypergraph construction and hypergraph partitioning. Two important questions in hypergraph construction are how to generate the hyperedges and how many hyperedges should be used to represent the original data. Recently, Pulak Purkait et al. proposed a method for generating large pure hyperedges, which has been shown to be more effective than traditional methods for computer vision tasks. However, that method requires the number of hyperedges to be specified in advance and generates hyperedges by random sampling, which may lead to suboptimal clustering results. This work therefore proposes a novel sampling method called greedy neighborhood search, which generates large pure hyperedges based on Shared Reverse k Nearest Neighbors (SRNN) and learns the number of hyperedges at the same time. Experiments show the benefits of applying the proposed method to high-dimensional data.
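The abstract names Shared Reverse k Nearest Neighbors (SRNN) without defining it. As a rough illustration only, the sketch below assumes the usual meaning of a reverse k-nearest neighbor (point j is a reverse neighbor of point i if i is among the k nearest neighbors of j) and assumes that the shared count between two points is the size of the intersection of their reverse-neighbor sets. The function names `reverse_knn_sets` and `shared_rnn_count`, the Euclidean metric, and the toy data are hypothetical choices and are not taken from the paper; the greedy neighborhood search itself is not reproduced here.

```python
import numpy as np


def reverse_knn_sets(points, k):
    """For each point i, collect the points j that have i among their k nearest neighbors.

    Assumption: Euclidean distance; a point is never counted as its own neighbor.
    """
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)  # pairwise squared distances
    np.fill_diagonal(dist, np.inf)               # exclude self-matches
    knn = np.argsort(dist, axis=1)[:, :k]        # row j holds the k nearest neighbors of j
    rnn = [set() for _ in range(n)]
    for j in range(n):
        for i in knn[j]:
            rnn[int(i)].add(j)                   # j is a reverse neighbor of i
    return rnn


def shared_rnn_count(rnn, a, b):
    """Number of reverse k-nearest neighbors that points a and b have in common."""
    return len(rnn[a] & rnn[b])


# Toy data (hypothetical): two well-separated groups of three points each.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])
rnn = reverse_knn_sets(points, k=2)
same_group = shared_rnn_count(rnn, 0, 1)
different_group = shared_rnn_count(rnn, 0, 3)
```

Under these assumptions, points in the same group share reverse neighbors while points in different groups share none, which is the kind of signal a sampling scheme could use to keep a hyperedge within a single cluster.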
