Aarhus University Seal / Aarhus Universitets segl

Seminar: Handling Skylines with Large Outputs

Activity: Talk or presentation typesLecture and oral contribution

See relations at Aarhus University

Sean Chester - Invited speaker

The skyline operator is a well-established tool for reducing an input dataset to points that are “most interesting.” It returns exactly those that have larger values in at least one dimension when compared to any other (distinct) data point in the dataset. Practically, these are the data points that maximize somebody’s preferences. However, the size of the skyline is bounded only by the input size, which has two immediate consequences. First, since the execution time for the skyline operator is primarily dependent on the output size, performance suffers. Second, no user wants the entire input returned as the output.

In this talk, I address these two concerns. By breaking from the traditional share-nothing paradigm for parallelizing skyline queries, we can introduce a multicore algorithm that is the first to consistently outperform state-of-the-art sequential computation. Next, I introduce a new means of selecting a small, fixed-size subset of the skyline that minimizes the “regret” that a top-k user would suffer if using the subset rather than the full skyline. To this end, I show intractability, efficient geometric techniques for two dimensions, and effective randomized techniques for the NP-hard general case. Finally, I conclude with some interesting open problems that have arisen from this work.

Invited by Aamir Cheema of Monash University, Melbourne.
10 Sep 2014

External organisation

NameUnknown external organisation

ID: 81384975