TY - JOUR
T1 - SkyAlign: a portable, work-efficient skyline algorithm for multicore and GPU architectures
AU - Bøgh, Kenneth Sejdenfaden
AU - Chester, Sean
AU - Assent, Ira
PY - 2016/12
Y1 - 2016/12
N2 - The skyline operator determines points in a multidimensional dataset that offer some optimal trade-off. State-of-the-art CPU skyline algorithms exploit quad-tree partitioning with complex branching to minimise the number of point-to-point comparisons. Branch-phobic GPU skyline algorithms rely on compute throughput rather than partitioning, but fail to match the performance of sequential algorithms. In this paper, we introduce a new skyline algorithm, SkyAlign, that is designed for the GPU, and a GPU-friendly, grid-based tree structure upon which the algorithm relies. The search tree allows us to dramatically reduce the amount of work done by the GPU algorithm by avoiding most point-to-point comparisons at the cost of some compute throughput. This trade-off allows SkyAlign to achieve orders of magnitude faster performance than its predecessors. Moreover, a NUMA-oblivious port of SkyAlign outperforms native multicore state of the art on challenging workloads by an increasing margin as more cores and sockets are utilised.
AB - The skyline operator determines points in a multidimensional dataset that offer some optimal trade-off. State-of-the-art CPU skyline algorithms exploit quad-tree partitioning with complex branching to minimise the number of point-to-point comparisons. Branch-phobic GPU skyline algorithms rely on compute throughput rather than partitioning, but fail to match the performance of sequential algorithms. In this paper, we introduce a new skyline algorithm, SkyAlign, that is designed for the GPU, and a GPU-friendly, grid-based tree structure upon which the algorithm relies. The search tree allows us to dramatically reduce the amount of work done by the GPU algorithm by avoiding most point-to-point comparisons at the cost of some compute throughput. This trade-off allows SkyAlign to achieve orders of magnitude faster performance than its predecessors. Moreover, a NUMA-oblivious port of SkyAlign outperforms native multicore state of the art on challenging workloads by an increasing margin as more cores and sockets are utilised.
U2 - 10.1007/s00778-016-0438-1
DO - 10.1007/s00778-016-0438-1
M3 - Journal article
SN - 1066-8888
VL - 25
SP - 817
EP - 841
JO - The VLDB Journal
JF - The VLDB Journal
IS - 6
ER -