Outlier Detection with Space Transformation and Spectral Analysis

Publication: Research - peer-review; Article in proceedings

Documents

  • SDM2013

    Submitted manuscript, 222 KB, PDF-document

Authors

  • Xuan-Hong Dang, Denmark
  • Barbora Micenková, Denmark
  • Ira Assent
  • Raymond T. Ng, University of British Columbia, Canada
Detecting a small number of outliers from a set of data
observations is always challenging. In this paper, we present an
approach that exploits space transformation and uses spectral
analysis in the newly transformed space for outlier detection.
Unlike most existing techniques in the literature, which rely on
notions of distances or densities, this approach introduces a
novel concept based on local quadratic entropy for evaluating the
similarity of a data object with its neighbors. This
information-theoretic quantity is used to regularize the closeness among
data instances and subsequently benefits the process of mapping
data into a usually lower-dimensional space. Outliers are then
identified by spectral analysis of the eigenspace spanned by the
set of leading eigenvectors derived from the mapping procedure.
The proposed technique is purely data-driven and imposes no
assumptions regarding the data distribution, making it
particularly suitable for identification of outliers from
irregular, non-convex shaped distributions and from data with
diverse, varying densities.
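
The abstract does not define the local quadratic entropy, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes the standard Parzen-window estimator of Renyi's quadratic entropy, applied to each point together with its k nearest neighbors. The function names, the neighborhood size `k`, and the kernel width `sigma` are all hypothetical, and the mapping and spectral-analysis steps are not shown.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_nearest(points, i, k):
    """Indices of the k nearest neighbors of points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: sq_dist(points[i], points[j]))
    return others[:k]


def quadratic_entropy(group, sigma):
    """Parzen-window estimate of Renyi's quadratic entropy of a point set.

    H2 = -log( (1/n^2) * sum_i sum_j G(x_i - x_j; 2 * sigma^2) ),
    where G is an isotropic Gaussian density (an assumed kernel choice).
    """
    n = len(group)
    dim = len(group[0])
    var = 2.0 * sigma ** 2
    norm = (2.0 * math.pi * var) ** (-dim / 2.0)
    potential = sum(
        norm * math.exp(-sq_dist(a, b) / (2.0 * var))
        for a in group
        for b in group
    ) / (n * n)
    return -math.log(potential)


def local_entropies(points, k=3, sigma=1.0):
    """Entropy of each point's neighborhood (the point plus its k neighbors)."""
    result = []
    for i in range(len(points)):
        group = [points[i]] + [points[j] for j in k_nearest(points, i, k)]
        result.append(quadratic_entropy(group, sigma))
    return result


# A tight cluster of four points and one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
entropies = local_entropies(points, k=3, sigma=1.0)
```

In this toy data set the distant point's neighborhood is more spread out than any neighborhood inside the cluster, so its estimated entropy is the largest of the five values. How the paper turns such a quantity into a similarity measure is not stated in the abstract.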
Original language: English
Title of host publication: Proceedings of the 2013 SIAM International Conference on Data Mining, SDM
Editors: Chandrika Kamath, Jennifer Dy, Zoran Obradovic, Joydeep Ghosh, Srinivasan Parthasarathy, Zhi-Hua Zhou
Number of pages: 9
Publisher: Society for Industrial and Applied Mathematics
Publication date: May 2013
Pages: 225-233
ISBN (print): 978-1-61197-262-7
ISBN (electronic): 978-1-61197-283-2
State: Published - May 2013
Event: SIAM International Conference on Data Mining, Austin, Texas, United States

Conference

Conference: SIAM International Conference on Data Mining
Country: United States
City: Austin, Texas
Period: 02/05/2013 - 04/05/2013


ID: 68749531