Aarhus Universitets segl

Segmented Merge: A New Primitive for Parallel Sparse Matrix Computations

Publikation: Bidrag til tidsskrift/Konferencebidrag i tidsskrift /Bidrag til avisTidsskriftartikelForskningpeer review

DOI

  • Haonan Ji, China University of Petroleum - Beijing
  • ,
  • Shibo Lu, China University of Petroleum - Beijing
  • ,
  • Kaixi Hou, Virginia Polytechnic Institute and State University
  • ,
  • Hao Wang, Ohio State University
  • ,
  • Zhou Jin, China University of Petroleum - Beijing
  • ,
  • Weifeng Liu, China University of Petroleum - Beijing
  • ,
  • Brian Vinter

Segmented operations, such as segmented sum, segmented scan and segmented sort, are important building blocks for parallel irregular algorithms. We in this work propose a new parallel primitive called segmented merge. Its function is in parallel merging q sub-segments to p segments, both of possibly nonuniform lengths which easily cause the load balancing and the vectorization problems on massively parallel processors, such as GPUs. Our algorithm resolves these problems by first recording the boundaries of segments and sub-segments, then assigning roughly the same number of elements for GPU threads, and finally iteratively merging the sub-segments in each segment in the form of binary tree until there is only one sub-segment in each segment. We implement the segmented merge primitive on GPUs and demonstrate its efficiency on parallel sparse matrix transposition (SpTRANS) and sparse matrix–matrix multiplication (SpGEMM) operations. We conduct a comparative experiment with NVIDIA vendor library on two GPUs. The experimental results show that our algorithm achieve on average 3.94× (up to 13.09×) and 2.89× (up to 109.15×) speedup on SpTRANS and SpGEMM, respectively.

OriginalsprogEngelsk
TidsskriftInternational Journal of Parallel Programming
Vol/bind49
Nummer5
Sider (fra-til)732-744
Antal sider13
ISSN0885-7458
DOI
StatusUdgivet - okt. 2021

Bibliografisk note

Funding Information:
We would like to thank the invaluable comments from all the reviewers. Zhou Jin is the corresponding author of this paper. This research was supported by the Science Challenge Project under Grant No. TZZT2016002, the National Natural Science Foundation of China under Grant No. 61972415, and the Science Foundation of China University of Petroleum, Beijing under Grant Nos. 2462019YJRC004, 2462020XKJS03.

Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Se relationer på Aarhus Universitet Citationsformater

ID: 222522628