I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis

Research output: Contribution to book/anthology/report/proceedingArticle in proceedingsResearchpeer-review

  • Department of Computer Science
Despite extensive study over the last four decades and numerous applications, no I/O-efficient algorithm is known for the union-find problem. In this paper we present an I/O-efficient algorithm for the batched (off-line) version of the union-find problem. Given any sequence of N union and find operations, where each union operation joins two distinct sets, our algorithm uses O(sort(N)) = O(N/BlogM/BN/B) I/Os, where M is the memory size and B is the disk block size. This bound is asymptotically optimal in the worst case. If there are union operations that join a set with itself, our algorithm uses O(sort(N) + mst(N)) I/Os, where mst(N) is the number of I/Os needed to compute the minimum spanning tree of a graph with N edges. We also describe a simple and practical O(sort(N)log(N/M))-I/O algorithm for this problem, which we have implemented.We are interested in the union-find problem because of its applications in terrain analysis. A terrain can be abstracted as a height function defined over R2, and many problems that deal with such functions require a union-find data structure. With the emergence of modern mapping technologies, huge amount of elevation data is being generated that is too large to fit in memory, thus I/O-efficient algorithms are needed to process this data efficiently. In this paper, we study two terrain analysis problems that benefit from a union-find data structure: (i) computing topological persistence and (ii) constructing the contour tree. These structures have important applications such as terrain modeling, flow analysis, topological feature extraction, etc. We give the first O(sort(N))-I/O algorithms for these two problems, assuming that the input terrain is represented as a triangular mesh with N vertices.Finally, we report some preliminary experimental results, showing that our algorithms give order-of-magnitude improvement over previous methods on large data sets that do not fit in memory.
Original languageEnglish
Title of host publicationProceedings of the twenty-second annual symposium on Computational geometry
Number of pages9
PublisherAssociation for Computing Machinery
Publication year2006
Pages167-176
ISBN (print)1-59593-340-9
DOIs
Publication statusPublished - 2006
EventAnnaul Symposium on Computational Geometry. SoCG '06 - Sedona, United States
Duration: 5 Jun 20067 Jun 2006
Conference number: 22

Conference

ConferenceAnnaul Symposium on Computational Geometry. SoCG '06
Nummer22
LandUnited States
BySedona
Periode05/06/200607/06/2006

See relations at Aarhus University Citationformats

ID: 21992878