Data Exploration using Example-based Methods

Research output: Book/anthology/dissertation/report › Book › Education › peer-review

  • Matteo Lissandrini, Department of Computer Science, Aalborg University, Denmark
  • Davide Mottin
  • Themis Palpanas, University Paris Descartes, Paris, France
  • Yannis Velegrakis, University of Trento, Italy
Data usually comes in a plethora of formats and dimensions, which makes information extraction and exploration challenging. The ability to perform exploratory analyses that give an immediate glimpse of some of the data's properties is therefore becoming crucial. Exploratory analyses should be simple enough to avoid complicated declarative languages (such as SQL) and mechanisms, while retaining the flexibility and expressiveness of such languages. Recently, we have witnessed a rediscovery of the so-called example-based methods, in which the user, or analyst, circumvents query languages by using examples as input. An example is a representative of the intended results or, in other words, an item from the result set. Example-based methods exploit inherent characteristics of the data to infer the results that the user has in mind but may not be able to (easily) express. They can be useful when a user is looking for information in an unfamiliar dataset, is performing a particularly challenging task such as finding duplicate items, or is simply exploring the data. In this book, we present an overview of the main methods for exploratory analysis, with a particular focus on example-based methods. We show how different data types require different techniques and present algorithms that are specifically designed for relational, textual, and graph data. The book also presents the challenges and new frontiers of machine learning in online settings, which have recently attracted the attention of the database community. The book concludes with a vision for further research and applications in this area.
Original language: English
Publisher: Morgan & Claypool Publishers
Volume: 10/4
Number of pages: 164
ISBN (Print): 9781681734576
ISBN (Electronic): 9781681734552
Publication status: Published - 1 Dec 2018
Series: Synthesis Lectures on Data Management
ISSN: 2153-5418

    Research areas

  • Data mining, database, example-based

ID: 137680643