Design and Evaluation of a Parallel Execution Framework for the CLEVER Clustering Algorithm

Chen, Chung Sheng; Shaikh, Nauful; Charoenrattanaruk, Panitee; Eick, Christoph F.; Rizk, Nouhad; Gabriel, Edgar

doi:10.3233/978-1-61499-041-3-73

Abstract

Data mining is used to extract valuable knowledge from vast pools of data. Because of the computational complexity of the algorithms involved and the difficulty of handling large data sets, data mining applications often require days to complete their analysis. This paper presents the design and evaluation of a parallel computation framework for CLEVER, a prototype-based clustering algorithm that has been used successfully in a wide range of application scenarios. The algorithm supports plug-in fitness functions and employs randomized hill climbing to maximize a given fitness function. We explore various parallelization strategies using OpenMP and CUDA, and evaluate the performance of the parallel algorithms on three different data sets. Our results indicate very good scalability of the parallel algorithm on multi-core processors, reducing the execution time and making it possible to solve problems that were considered infeasible with the sequential version of CLEVER.
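
To illustrate the general idea of prototype-based clustering driven by randomized hill climbing over a plug-in fitness function, the following is a minimal sequential sketch. It is not the authors' implementation: the function names (`assign`, `neg_sse`, `hill_climb`), the single "replace one representative" neighborhood move, the negative-SSE fitness, and the `samples` and `max_iter` parameters are all assumptions made for illustration.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, reps):
    """Assign each point index to its closest representative (a point index)."""
    clusters = {r: [] for r in reps}
    for i, p in enumerate(points):
        closest = min(reps, key=lambda r: sq_dist(p, points[r]))
        clusters[closest].append(i)
    return clusters

def neg_sse(points, clusters):
    """Hypothetical plug-in fitness: negative sum of squared distances
    from each point to its representative (higher is better)."""
    return -sum(sq_dist(points[i], points[r])
                for r, members in clusters.items() for i in members)

def hill_climb(points, k, fitness, samples=20, max_iter=100, seed=0):
    """Randomized hill climbing over sets of k representatives.

    Assumed neighborhood move: swap one representative for a random
    non-representative point. Stops when `samples` random neighbors
    in a row fail to improve the fitness, or after `max_iter` moves.
    """
    rng = random.Random(seed)
    reps = rng.sample(range(len(points)), k)
    best = fitness(points, assign(points, reps))
    for _ in range(max_iter):
        improved = False
        outside = [i for i in range(len(points)) if i not in reps]
        if not outside:
            break  # every point is already a representative
        for _ in range(samples):
            cand = list(reps)
            cand[rng.randrange(k)] = rng.choice(outside)
            score = fitness(points, assign(points, cand))
            if score > best:  # accept the first improving neighbor
                reps, best, improved = cand, score, True
                break
        if not improved:
            break
    return reps, best

if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    reps, score = hill_climb(data, k=2, fitness=neg_sse)
    print("representatives:", [data[r] for r in reps], "fitness:", score)
```

Because the fitness function is passed in as an argument, a different objective can be swapped in without touching the search loop, which is the sense in which the fitness function is "plug-in" in this sketch.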
