TY - JOUR
T1 - Hierarchical categories in colored searching
AU - Afshani, Peyman
AU - Killmann, Rasmus
AU - Larsen, Kasper Green
PY - 2024/8
Y1 - 2024/8
N2 - In colored range counting (CRC), the input is a set of points where each point is assigned a “color” (or a “category”), and the goal is to store them in a data structure such that the number of distinct categories inside a given query range can be counted efficiently. CRC is well motivated, as it allows data structures to deal with categorical data. However, colors (i.e., the categories) in the CRC problem do not have any internal structure, whereas many datasets in practice contain hierarchical categories or inputs that belong to multiple categories. Motivated by this, we consider variants of the problem where such structures can be represented. We define two variants of the problem, called hierarchical range counting (HCC) and sub-category colored range counting (SCRC), and consider hierarchical structures that can be either a DAG or a tree. We show that the two problems on some special trees are in fact equivalent to other well-known problems in the literature. Based on these equivalences, we also give efficient data structures when the underlying hierarchy can be represented as a tree. We show a conditional lower bound for the general case, when the hierarchy can be any DAG, through a reduction from the orthogonal vectors problem.
AB - In colored range counting (CRC), the input is a set of points where each point is assigned a “color” (or a “category”), and the goal is to store them in a data structure such that the number of distinct categories inside a given query range can be counted efficiently. CRC is well motivated, as it allows data structures to deal with categorical data. However, colors (i.e., the categories) in the CRC problem do not have any internal structure, whereas many datasets in practice contain hierarchical categories or inputs that belong to multiple categories. Motivated by this, we consider variants of the problem where such structures can be represented. We define two variants of the problem, called hierarchical range counting (HCC) and sub-category colored range counting (SCRC), and consider hierarchical structures that can be either a DAG or a tree. We show that the two problems on some special trees are in fact equivalent to other well-known problems in the literature. Based on these equivalences, we also give efficient data structures when the underlying hierarchy can be represented as a tree. We show a conditional lower bound for the general case, when the hierarchy can be any DAG, through a reduction from the orthogonal vectors problem.
KW - Computational geometry
KW - Data structures
KW - Range searching
UR - http://www.scopus.com/inward/record.url?scp=85188432604&partnerID=8YFLogxK
U2 - 10.1016/j.comgeo.2024.102090
DO - 10.1016/j.comgeo.2024.102090
M3 - Journal article
SN - 0925-7721
VL - 121
JO - Computational Geometry
JF - Computational Geometry
M1 - 102090
ER -