INSCY: indexing subspace clusters with in-process-removal of redundancy

Research output: Contribution to book/anthology/report/proceeding › Article in proceedings › Research › peer-review

  • Ira Assent
  • Ralph Krieger, RWTH Aachen University, Germany
  • Emmanuel Müller, RWTH Aachen University, Germany
  • Thomas Seidl, RWTH Aachen University, Germany
Clustering is an established data mining technique for grouping objects based on mutual similarity. In high-dimensional spaces, clusters are typically hidden in the scatter of irrelevant attributes. To detect these hidden clusters, subspace clustering focuses on relevant attribute projections for each individual cluster. As the number of possible projections is exponential in the number of dimensions, efficiency is crucial for these high-dimensional settings. Moreover, the resulting subspace clusters are often highly redundant, i.e., many clusters are detected multiple times across several projections. Containing essentially the same information, redundant subspace clusters have to be removed to allow users to review the entire output. In addition, removal of low-dimensional redundancy actually improves quality. In this work, we propose a novel index structure for efficient subspace clustering with in-process-removal of redundant clusters. Unlike existing breadth-first approaches, INSCY (INdexing Subspace Clusters without redundancY) proceeds depth-first on the dimensionality of the subspaces. Our depth-first mining with index support allows immediate pruning of redundant subspace clusters and thus greatly reduces the computational cost of subspace clustering. Thorough experiments on real and synthetic data show that INSCY yields substantial efficiency and quality improvements over existing subspace clustering approaches.
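The two ideas in the abstract, the exponential number of projections and the removal of low-dimensional redundant clusters, can be illustrated with a minimal Python sketch. This is not the INSCY index or its in-process pruning: it is a simple post-hoc filter under an assumed redundancy test, in which a cluster counts as redundant if a cluster in a strict superset of its dimensions shares at least a fraction `ratio` of its objects. The function names, the `ratio` parameter, and the representation of a cluster as an `(objects, dims)` pair are all hypothetical.

```python
def count_subspaces(d):
    """Number of non-empty attribute subsets (projections) of d dimensions."""
    return 2 ** d - 1


def is_redundant(cluster, higher_dim_clusters, ratio=0.5):
    """Assumed redundancy test (illustrative, not taken from the paper):
    a cluster is redundant if some cluster in a strict superset of its
    dimensions shares at least `ratio` of its objects."""
    objects, dims = cluster
    for other_objects, other_dims in higher_dim_clusters:
        # `<` on frozensets is the proper-subset test.
        if dims < other_dims and len(objects & other_objects) >= ratio * len(objects):
            return True
    return False


def remove_redundant(clusters, ratio=0.5):
    """Visit clusters from highest to lowest dimensionality and keep only
    those not covered by an already kept higher-dimensional cluster."""
    kept = []
    for cluster in sorted(clusters, key=lambda c: len(c[1]), reverse=True):
        if not is_redundant(cluster, kept, ratio):
            kept.append(cluster)
    return kept


# Each cluster is (set of object ids, set of dimension indices).
clusters = [
    (frozenset({1, 2, 3, 4}), frozenset({0})),     # 1-d projection of the next cluster
    (frozenset({1, 2, 3}), frozenset({0, 1})),     # 2-d cluster
    (frozenset({7, 8, 9}), frozenset({2})),        # unrelated 1-d cluster
]
result = remove_redundant(clusters)
```

With this input, the one-dimensional cluster over dimension 0 is dropped because the two-dimensional cluster over dimensions 0 and 1 covers three of its four objects, while the unrelated cluster over dimension 2 is kept. Processing higher-dimensional clusters first is what lets lower-dimensional projections be discarded as soon as they are seen, which is the intuition behind mining depth-first rather than breadth-first.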
Original language: English
Title of host publication: Eighth IEEE International Conference on Data Mining, 2008. ICDM '08.
Number of pages: 6
Publication year: 2008
Pages: 719-724
ISBN (Electronic): 978-0-7695-3502-9
Publication status: Published - 2008
Externally published: Yes

ID: 47659721