Abstract
The Hungarian algorithm is a popular solution for the linear assignment problem that finds correspondences between sets of items. Despite its popularity, this algorithm suffers from significant efficiency shortcomings which hinders its application to large datasets. To overcome this challenge, we design a parallel version of the algorithm for novel tile-centric accelerators which consist of thousands of small processing cores equipped with memory that is local to each core. This allows them to overcome the memory latency and bandwidth limits of CPUs and GPUs, but, in turn, data and work must be carefully distributed among the many cores. We select the tile-centric Intelligence Processing Unit and discuss its capabilities and the challenges for algorithm design. We then present HunIPU, which outperforms the best GPU algorithm running 6× faster on synthetic datasets and up to 32× on real-world datasets for graph alignment. The code is available at https://zenodo.org/records/13832045.
| Originalsprog | Engelsk |
|---|---|
| Titel | SIAM Symposium on Algorithm Engineering and Experiments, ALENEX 2025 |
| Antal sider | 13 |
| Forlag | Society for Industrial and Applied Mathematics Publications |
| Publikationsdato | 2025 |
| Sider | 107-119 |
| ISBN (Elektronisk) | 9798331311995 |
| Status | Udgivet - 2025 |
| Begivenhed | 2025 SIAM Symposium on Algorithm Engineering and Experiments, ALENEX 2025 - New Orleans, USA Varighed: 12 jan. 2025 → 13 jan. 2025 |
Konference
| Konference | 2025 SIAM Symposium on Algorithm Engineering and Experiments, ALENEX 2025 |
|---|---|
| Land/Område | USA |
| By | New Orleans |
| Periode | 12/01/2025 → 13/01/2025 |