GPU-FAST-PROCLUS: A Fast GPU-parallelized Approach to Projected Clustering

Jakob Rødsgaard Jørgensen, Katrine Scheel Nellemann, Ira Assent, Ajeet Ram Pathak, Anne C. Elster

Research output: Contribution to conferencePaperResearchpeer-review

Abstract

Projected and subspace clustering aim to find groups of similar objects within a subspace of the full-dimensional space. Where subspace clustering tries to identify clusters in all possible subspaces, projected clustering assigns each point to a single cluster within one projected subspace, resulting in a much smaller result set. PROCLUS is an adaptation of the k-medoids clustering algorithm, CLARANS, to projected clustering. Even though PROCLUS is the first projected clustering algorithm, it is still competitive in comparative empirical studies.

PROCLUS is, however, still too slow for large-scale data or real-time interaction when used in information retrieval processes. Therefore, we propose novel algorithmic strategies to reduce computations and exploit the massive parallelism offered by modern graphical processing units (GPUs). To take advantage of their high degree of parallelism, standard sequential algorithms need to be significantly restructured. We therefore also propose a novel GPU-parallelized algorithm, GPU-FAST-PROCLUS, that takes advantage of the computational power of modern GPUs.

We provide experimental studies that demonstrate the benefit of our proposed strategies and GPU-parallelizations. In this experimental evaluation, we obtain 3 orders of magnitude speedup compared to PROCLUS.
Original languageEnglish
Publication dateMar 2022
Publication statusPublished - Mar 2022
EventEDBT 2022: 24th International Conference on Extending Database Technology -
Duration: 29 Mar 20221 Apr 2022

Conference

ConferenceEDBT 2022: 24th International Conference on Extending Database Technology
Period29/03/202201/04/2022

Fingerprint

Dive into the research topics of 'GPU-FAST-PROCLUS: A Fast GPU-parallelized Approach to Projected Clustering'. Together they form a unique fingerprint.

Cite this