Why Throughput Isn't Everything: The Case of Parallelizing Skyline Queries

Activity: Talk or presentation types - Lecture and oral contribution

Sean Chester - Invited speaker

The extreme parallelism available in modern hardware suggests a way to combat the Big Data deluge. However, harnessing that parallelism can be quite challenging for many data management problems. The skyline query, which filters an input dataset to only the most salient points therein, is one such example. We see that sophisticated, single-threaded algorithms can outperform high-throughput parallel algorithms by orders of magnitude, even when the parallel algorithms are run on state-of-the-art graphics processing units (GPUs) with 2680 physical cores. In this talk, I discuss how considering work-efficiency (the idea that parallel algorithms must be clever, too, even at the expense of throughput) can lead to algorithms that drastically outperform both sequential and massive-throughput competitors. The material is based on a paper we presented at ICDE 2015 (regarding multicore CPUs) and a paper that will be presented at VLDB 2015 (focusing on GPUs). At the end of the talk, I will discuss how these challenges manifest themselves again in some ongoing work on clustering natural language in social media.
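For readers unfamiliar with the skyline query, the following is a minimal sketch of what it computes, not one of the algorithms from the talk. It assumes the standard dominance definition with smaller values preferred on every attribute (the abstract does not state a convention), and the names `dominates` and `skyline` are illustrative. The quadratic all-pairs comparison shown here is the kind of simple, easily parallelized baseline that the abstract contrasts with more work-efficient approaches.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p is at least as good as q on every attribute and
    strictly better on at least one (assumption: smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep each point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical data: (price, distance to venue) for five hotels.
    hotels = [(50, 8), (80, 2), (60, 9), (90, 3), (70, 5)]
    print(skyline(hotels))
```

In this hypothetical data, (60, 9) is dominated by (50, 8) and (90, 3) is dominated by (80, 2), so only the remaining three points survive. Every point is tested independently, which makes the loop trivial to parallelize but also means it does far more comparisons than a cleverer algorithm would need.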
10 Sep 2015

Event (Seminar)

Title: Why Throughput Isn't Everything: The Case of Parallelizing Skyline Queries
Date: 10/09/2015 → …
Location: Simon Fraser University (SFU)
City: Burnaby
Country: Canada

Keywords

• parallelism, algorithms, skyline, work-efficiency, throughput

ID: 91018670