Research output: Contribution to book/anthology/report/proceeding › Article in proceedings › Research › peer-review

**I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis.** / Agarwal, Pankaj K.; Arge, Lars Allan; Yi, Ke.

Research output: Contribution to book/anthology/report/proceeding › Article in proceedings › Research › peer-review

Agarwal, PK, Arge, LA & Yi, K 2006, I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis. in *Proceedings of the twenty-second annual symposium on Computational geometry.* Association for Computing Machinery, pp. 167-176, Annaul Symposium on Computational Geometry. SoCG '06, Sedona, United States, 05/06/2006. https://doi.org/10.1145/1137856.1137884

Agarwal, P. K., Arge, L. A., & Yi, K. (2006). I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis. In *Proceedings of the twenty-second annual symposium on Computational geometry *(pp. 167-176). Association for Computing Machinery. https://doi.org/10.1145/1137856.1137884

Agarwal PK, Arge LA, Yi K. 2006. I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis. In Proceedings of the twenty-second annual symposium on Computational geometry. Association for Computing Machinery. pp. 167-176. https://doi.org/10.1145/1137856.1137884

Agarwal, Pankaj K., Lars Allan Arge and Ke Yi "I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis". *Proceedings of the twenty-second annual symposium on Computational geometry.* Association for Computing Machinery. 2006, 167-176. https://doi.org/10.1145/1137856.1137884

Agarwal PK, Arge LA, Yi K. I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis. In Proceedings of the twenty-second annual symposium on Computational geometry. Association for Computing Machinery. 2006. p. 167-176 https://doi.org/10.1145/1137856.1137884

Agarwal, Pankaj K. ; Arge, Lars Allan ; Yi, Ke. / **I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis**. Proceedings of the twenty-second annual symposium on Computational geometry. Association for Computing Machinery, 2006. pp. 167-176

@inproceedings{c3208820c95011df8cb9000ea68e967b,

title = "I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis",

abstract = "Despite extensive study over the last four decades and numerous applications, no I/O-efficient algorithm is known for the union-find problem. In this paper we present an I/O-efficient algorithm for the batched (off-line) version of the union-find problem. Given any sequence of N union and find operations, where each union operation joins two distinct sets, our algorithm uses O(sort(N)) = O(N/BlogM/BN/B) I/Os, where M is the memory size and B is the disk block size. This bound is asymptotically optimal in the worst case. If there are union operations that join a set with itself, our algorithm uses O(sort(N) + mst(N)) I/Os, where mst(N) is the number of I/Os needed to compute the minimum spanning tree of a graph with N edges. We also describe a simple and practical O(sort(N)log(N/M))-I/O algorithm for this problem, which we have implemented.We are interested in the union-find problem because of its applications in terrain analysis. A terrain can be abstracted as a height function defined over R2, and many problems that deal with such functions require a union-find data structure. With the emergence of modern mapping technologies, huge amount of elevation data is being generated that is too large to fit in memory, thus I/O-efficient algorithms are needed to process this data efficiently. In this paper, we study two terrain analysis problems that benefit from a union-find data structure: (i) computing topological persistence and (ii) constructing the contour tree. These structures have important applications such as terrain modeling, flow analysis, topological feature extraction, etc. We give the first O(sort(N))-I/O algorithms for these two problems, assuming that the input terrain is represented as a triangular mesh with N vertices.Finally, we report some preliminary experimental results, showing that our algorithms give order-of-magnitude improvement over previous methods on large data sets that do not fit in memory.",

author = "Agarwal, {Pankaj K.} and Arge, {Lars Allan} and Ke Yi",

year = "2006",

doi = "10.1145/1137856.1137884",

language = "English",

isbn = "1-59593-340-9",

pages = "167--176",

booktitle = "Proceedings of the twenty-second annual symposium on Computational geometry",

publisher = "Association for Computing Machinery",

note = "null ; Conference date: 05-06-2006 Through 07-06-2006",

}

TY - GEN

T1 - I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis

AU - Agarwal, Pankaj K.

AU - Arge, Lars Allan

AU - Yi, Ke

N1 - Conference code: 22

PY - 2006

Y1 - 2006

N2 - Despite extensive study over the last four decades and numerous applications, no I/O-efficient algorithm is known for the union-find problem. In this paper we present an I/O-efficient algorithm for the batched (off-line) version of the union-find problem. Given any sequence of N union and find operations, where each union operation joins two distinct sets, our algorithm uses O(sort(N)) = O(N/BlogM/BN/B) I/Os, where M is the memory size and B is the disk block size. This bound is asymptotically optimal in the worst case. If there are union operations that join a set with itself, our algorithm uses O(sort(N) + mst(N)) I/Os, where mst(N) is the number of I/Os needed to compute the minimum spanning tree of a graph with N edges. We also describe a simple and practical O(sort(N)log(N/M))-I/O algorithm for this problem, which we have implemented.We are interested in the union-find problem because of its applications in terrain analysis. A terrain can be abstracted as a height function defined over R2, and many problems that deal with such functions require a union-find data structure. With the emergence of modern mapping technologies, huge amount of elevation data is being generated that is too large to fit in memory, thus I/O-efficient algorithms are needed to process this data efficiently. In this paper, we study two terrain analysis problems that benefit from a union-find data structure: (i) computing topological persistence and (ii) constructing the contour tree. These structures have important applications such as terrain modeling, flow analysis, topological feature extraction, etc. We give the first O(sort(N))-I/O algorithms for these two problems, assuming that the input terrain is represented as a triangular mesh with N vertices.Finally, we report some preliminary experimental results, showing that our algorithms give order-of-magnitude improvement over previous methods on large data sets that do not fit in memory.

AB - Despite extensive study over the last four decades and numerous applications, no I/O-efficient algorithm is known for the union-find problem. In this paper we present an I/O-efficient algorithm for the batched (off-line) version of the union-find problem. Given any sequence of N union and find operations, where each union operation joins two distinct sets, our algorithm uses O(sort(N)) = O(N/BlogM/BN/B) I/Os, where M is the memory size and B is the disk block size. This bound is asymptotically optimal in the worst case. If there are union operations that join a set with itself, our algorithm uses O(sort(N) + mst(N)) I/Os, where mst(N) is the number of I/Os needed to compute the minimum spanning tree of a graph with N edges. We also describe a simple and practical O(sort(N)log(N/M))-I/O algorithm for this problem, which we have implemented.We are interested in the union-find problem because of its applications in terrain analysis. A terrain can be abstracted as a height function defined over R2, and many problems that deal with such functions require a union-find data structure. With the emergence of modern mapping technologies, huge amount of elevation data is being generated that is too large to fit in memory, thus I/O-efficient algorithms are needed to process this data efficiently. In this paper, we study two terrain analysis problems that benefit from a union-find data structure: (i) computing topological persistence and (ii) constructing the contour tree. These structures have important applications such as terrain modeling, flow analysis, topological feature extraction, etc. We give the first O(sort(N))-I/O algorithms for these two problems, assuming that the input terrain is represented as a triangular mesh with N vertices.Finally, we report some preliminary experimental results, showing that our algorithms give order-of-magnitude improvement over previous methods on large data sets that do not fit in memory.

U2 - 10.1145/1137856.1137884

DO - 10.1145/1137856.1137884

M3 - Article in proceedings

SN - 1-59593-340-9

SP - 167

EP - 176

BT - Proceedings of the twenty-second annual symposium on Computational geometry

PB - Association for Computing Machinery

Y2 - 5 June 2006 through 7 June 2006

ER -