## Abstract

We study the query version of the approximate heavy hitter and quantile problems. In the former problem, the input is a parameter ε and a set P of n points in R
^{d} where each point is assigned a color from a set C, and the goal is to build a structure such that given any geometric range γ, we can efficiently find a list of approximate heavy hitters in γ ∩ P, i.e., colors that appear at least ε|γ ∩ P| times in γ ∩ P, as well as their frequencies with an additive error of ε|γ ∩ P|. In the latter problem, each point is assigned a weight from a totally ordered universe and the query must output a sequence S of 1 + 1/ε weights such that the i-th weight in S has approximate rank iε|γ ∩ P|, meaning, rank iε|γ ∩ P| up to an additive error of ε|γ ∩ P|. Previously, optimal results were only known in 1D [23] but a few sub-optimal methods were available in higher dimensions [4, 6]. We study the problems for two important classes of geometric ranges: 3D halfspace and 3D dominance queries. It is known that many other important queries can be reduced to these two, e.g., 1D interval stabbing or interval containment, 2D three-sided queries, 2D circular as well as 2D k-nearest neighbors queries. We consider the real RAM model of computation where integer registers of size w bits, w = Θ(log n), are also available. For dominance queries, we show optimal solutions for both heavy hitter and quantile problems: using linear space, we can answer both queries in time O(log n + 1/ε). Note that as the output size is
^{1}
_{ε}, after investing the initial O(log n) searching time, our structure takes on average O(1) time to find a heavy hitter or a quantile! For more general halfspace heavy hitter queries, the same optimal query time can be achieved by increasing the space by an extra log
_{w}
^{1}
_{ε} (resp. log log
_{w}
^{1}
_{ε}) factor in 3D (resp. 2D). By spending extra log
^{O}(1)
^{1}
_{ε} factors in both time and space, we can also support quantile queries. We remark that it is hopeless to achieve a similar query bound for dimensions 4 or higher unless significant advances are made in the data structure side of theory of geometric approximations.

Original language | English |
---|---|

Title of host publication | 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023) |

Editors | Kousha Etessami, Uriel Feige, Gabriele Puppis |

Publisher | Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik GmbH |

Publication date | 2023 |

Pages | 7:1-7:17 |

Article number | 7 |

ISBN (Print) | 978-3-95977-278-5 |

DOIs | |

Publication status | Published - 2023 |

Series | Leibniz International Proceedings in Informatics |
---|---|

Volume | 261 |

ISSN | 1868-8969 |

## Keywords

- Algorithms
- Computational Geometry
- Data Structures
- Geometric Approximation
- Random Sampling
- Range Searching