Detecting Complex Sensitive Information via Phrase Structure in Recursive Neural Networks

Jan Neerbek, Ira Assent, Peter Dolog

Research output: Contribution to book/anthology/report/proceeding › Article in proceedings › Research › peer-review

Abstract

State-of-the-art sensitive information detection in unstructured data relies on the frequency of co-occurrence of keywords with sensitive seed words. In practice, however, this may fail to detect more complex patterns of sensitive information. In this work, we propose learning phrase structures that separate sensitive from non-sensitive documents in recursive neural networks. Our evaluation on real data with human-labeled sensitive content shows that our new approach outperforms existing keyword-based strategies.

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 22nd Pacific-Asia Conference, PAKDD 2018, Proceedings: PAKDD '18
Editors: Dinh Phung, Vincent S. Tseng, Geoffrey I. Webb, Bao Ho, Mohadeseh Ganji, Lida Rashidi
Number of pages: 12
Volume: 10939
Publisher: Springer VS
Publication date: 2018
Pages: 373-385
ISBN (Print): 978-3-319-93039-8
ISBN (Electronic): 978-3-319-93040-4
Publication status: Published - 2018
Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining - Melbourne, Australia
Duration: 3 Jun 2018 to 6 Jun 2018
Conference number: 22
http://prada-research.net/pakdd18

Conference

Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining
Number: 22
Country/Territory: Australia
City: Melbourne
Period: 03/06/2018 to 06/06/2018
Series: Lecture Notes in Computer Science (LNCS)
Number: 10939
ISSN: 0302-9743

Keywords

  • Data Leak Prevention
  • Natural Text Understanding
  • Recursive Neural Networks
  • Sensitive Information
