A Dichotomy for Regular Expression Membership Testing

K. Bringmann, A. Grønlund, K. G. Larsen

Research output: Contribution to book/anthology/report/proceedingArticle in proceedingsResearchpeer-review

40 Citations (Scopus)

Abstract

We study regular expression membership testing: Given a regular expression of size m and a string of size n, decide whether the string is in the language described by the regular expression. Its classic O(nm) algorithm is one of the big success stories of the 70s, which allowed pattern matching to develop into the standard tool that it is today.Many special cases of pattern matching have been studied that can be solved faster than in quadratic time. However, a systematic study of tractable cases was made possible only recently, with the first conditional lower bounds reported by Backurs and Indyk [FOCS16]. Restricted to any type of homogeneous regular expressions of depth 2 or 3, they either presented a near-linear time algorithm or a quadratic conditional lower bound, with one exception known as the Word Break problem.In this paper we complete their work as follows:• We present two almost-linear time algorithms that generalize all known almost-linear time algorithms for special cases of regular expression membership testing.• We classify all types, except for the Word Break problem, into almost-linear time or quadratic time assuming the Strong Exponential Time Hypothesis. This extends the classification from depth 2 and 3 to any constant depth.• For the Word Break problem we give an improved O(nm1/3 + m) algorithm. Surprisingly, we also prove a matching conditional lower bound for combinatorial algorithms. This establishes Word Break as the only intermediate problem.In total, we prove matching upper and lower bounds for any type of bounded-depth homogeneous regular expressions, which yields a full dichotomy for regular expression member-ship testing.

Original languageEnglish
Title of host publicationProceedings - 58th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2017
Number of pages12
PublisherIEEE Computer Society Press
Publication date10 Nov 2017
Pages307-318
Article number8104068
ISBN (Print)978-1-5386-3465-3
ISBN (Electronic)978-1-5386-3464-6
DOIs
Publication statusPublished - 10 Nov 2017
Event58th Annual IEEE Symposium on Foundations of Computer Science - Berkeley, United States
Duration: 15 Oct 201717 Oct 2017
Conference number: 58
http://ieee-focs.org/FOCS-2017-Papers/

Other

Other58th Annual IEEE Symposium on Foundations of Computer Science
Number58
Country/TerritoryUnited States
CityBerkeley
Period15/10/201717/10/2017
Internet address

Fingerprint

Dive into the research topics of 'A Dichotomy for Regular Expression Membership Testing'. Together they form a unique fingerprint.

Cite this