TY - GEN
T1 - On Range Summary Queries
AU - Afshani, Peyman
AU - Cheng, Pingan
AU - Basu Roy, Aniket
AU - Wei, Zhewei
PY - 2023
Y1 - 2023
N2 - We study the query version of the approximate heavy hitter and quantile problems. In the former problem, the input is a parameter ε and a set P of n points in R
d where each point is assigned a color from a set C, and the goal is to build a structure such that given any geometric range γ, we can efficiently find a list of approximate heavy hitters in γ ∩ P, i.e., colors that appear at least ε|γ ∩ P| times in γ ∩ P, as well as their frequencies with an additive error of ε|γ ∩ P|. In the latter problem, each point is assigned a weight from a totally ordered universe and the query must output a sequence S of 1 + 1/ε weights such that the i-th weight in S has approximate rank iε|γ ∩ P|, meaning, rank iε|γ ∩ P| up to an additive error of ε|γ ∩ P|. Previously, optimal results were only known in 1D [23] but a few sub-optimal methods were available in higher dimensions [4, 6]. We study the problems for two important classes of geometric ranges: 3D halfspace and 3D dominance queries. It is known that many other important queries can be reduced to these two, e.g., 1D interval stabbing or interval containment, 2D three-sided queries, 2D circular as well as 2D k-nearest neighbors queries. We consider the real RAM model of computation where integer registers of size w bits, w = Θ(log n), are also available. For dominance queries, we show optimal solutions for both heavy hitter and quantile problems: using linear space, we can answer both queries in time O(log n + 1/ε). Note that as the output size is
1
ε, after investing the initial O(log n) searching time, our structure takes on average O(1) time to find a heavy hitter or a quantile! For more general halfspace heavy hitter queries, the same optimal query time can be achieved by increasing the space by an extra log
w
1
ε (resp. log log
w
1
ε) factor in 3D (resp. 2D). By spending extra log
O(1)
1
ε factors in both time and space, we can also support quantile queries. We remark that it is hopeless to achieve a similar query bound for dimensions 4 or higher unless significant advances are made in the data structure side of theory of geometric approximations.
AB - We study the query version of the approximate heavy hitter and quantile problems. In the former problem, the input is a parameter ε and a set P of n points in R
d where each point is assigned a color from a set C, and the goal is to build a structure such that given any geometric range γ, we can efficiently find a list of approximate heavy hitters in γ ∩ P, i.e., colors that appear at least ε|γ ∩ P| times in γ ∩ P, as well as their frequencies with an additive error of ε|γ ∩ P|. In the latter problem, each point is assigned a weight from a totally ordered universe and the query must output a sequence S of 1 + 1/ε weights such that the i-th weight in S has approximate rank iε|γ ∩ P|, meaning, rank iε|γ ∩ P| up to an additive error of ε|γ ∩ P|. Previously, optimal results were only known in 1D [23] but a few sub-optimal methods were available in higher dimensions [4, 6]. We study the problems for two important classes of geometric ranges: 3D halfspace and 3D dominance queries. It is known that many other important queries can be reduced to these two, e.g., 1D interval stabbing or interval containment, 2D three-sided queries, 2D circular as well as 2D k-nearest neighbors queries. We consider the real RAM model of computation where integer registers of size w bits, w = Θ(log n), are also available. For dominance queries, we show optimal solutions for both heavy hitter and quantile problems: using linear space, we can answer both queries in time O(log n + 1/ε). Note that as the output size is
1
ε, after investing the initial O(log n) searching time, our structure takes on average O(1) time to find a heavy hitter or a quantile! For more general halfspace heavy hitter queries, the same optimal query time can be achieved by increasing the space by an extra log
w
1
ε (resp. log log
w
1
ε) factor in 3D (resp. 2D). By spending extra log
O(1)
1
ε factors in both time and space, we can also support quantile queries. We remark that it is hopeless to achieve a similar query bound for dimensions 4 or higher unless significant advances are made in the data structure side of theory of geometric approximations.
KW - Algorithms
KW - Computational Geometry
KW - Data Structures
KW - Geometric Approximation
KW - Random Sampling
KW - Range Searching
UR - http://www.scopus.com/inward/record.url?scp=85167364004&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.ICALP.2023.7
DO - 10.4230/LIPIcs.ICALP.2023.7
M3 - Article in proceedings
SN - 978-3-95977-278-5
T3 - Leibniz International Proceedings in Informatics
SP - 7:1-7:17
BT - 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)
A2 - Etessami, Kousha
A2 - Feige, Uriel
A2 - Puppis, Gabriele
PB - Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik GmbH
ER -