Connecting the Dots: Density-Connectivity Distance unifies DBSCAN, k-Center and Spectral Clustering

Anna Beer, Andrew Draganov, Ellen Hohma, Philipp Jahn, Christian M.M. Frey, Ira Assent

Research output: Contribution to book/anthology/report/proceedingArticle in proceedingsResearchpeer-review

Abstract

Despite the popularity of density-based clustering, its procedural definition makes it difficult to analyze compared to clustering methods that minimize a loss function. In this paper, we reformulate DBSCAN through a clean objective function by introducing the density-connectivity distance (dc-dist), which captures the essence of density-based clusters by endowing the minimax distance with the concept of density. This novel ultrametric allows us to show that DBSCAN, k-center, and spectral clustering are equivalent in the space given by the dc-dist, despite these algorithms being perceived as fundamentally different in their respective literatures. We also verify that finding the pairwise dc-dists gives DBSCAN clusterings across all epsilon-values, simplifying the problem of parameterizing density-based clustering. We conclude by thoroughly analyzing density-connectivity and its properties - a task that has been elusive thus far in the literature due to the lack of formal tools. Our code recreates every experiment below: https://github.com/Andrew-Draganov/dc-dist

Original languageEnglish
Title of host publicationKDD 2023 : Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Number of pages13
Place of publicationNew York
PublisherAssociation for Computing Machinery
Publication dateAug 2023
Pages80-92
ISBN (Electronic)9798400701030
DOIs
Publication statusPublished - Aug 2023
Event29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023 - Long Beach, United States
Duration: 6 Aug 202310 Aug 2023

Conference

Conference29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023
Country/TerritoryUnited States
CityLong Beach
Period06/08/202310/08/2023
SponsorACM SIGKDD, ACM SIGMOD
SeriesProceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Keywords

  • clustering
  • density
  • minimax path
  • ultrametric space

Fingerprint

Dive into the research topics of 'Connecting the Dots: Density-Connectivity Distance unifies DBSCAN, k-Center and Spectral Clustering'. Together they form a unique fingerprint.

Cite this