Research output: Contribution to book/anthology/report/proceeding › Article in proceedings › Research › peer-review

**Cross-Referenced dictionaries and the limits of write optimization.** / Afshani, Peyman; Bender, Michael A.; Farach-Colton, Martin; Fineman, Jeremy T.; Goswami, Mayank; Tsai, Meng Tsung.


Afshani, P, Bender, MA, Farach-Colton, M, Fineman, JT, Goswami, M & Tsai, MT 2017, Cross-Referenced dictionaries and the limits of write optimization. in PN Klein (ed.), *28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017.* Association for Computing Machinery, pp. 1523-1532, 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, 16/01/2017.

Afshani, P., Bender, M. A., Farach-Colton, M., Fineman, J. T., Goswami, M., & Tsai, M. T. (2017). Cross-Referenced dictionaries and the limits of write optimization. In P. N. Klein (Ed.), *28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017* (pp. 1523-1532). Association for Computing Machinery.

Afshani P, Bender MA, Farach-Colton M, Fineman JT, Goswami M, Tsai MT. 2017. Cross-Referenced dictionaries and the limits of write optimization. Klein PN, editor. In 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017. Association for Computing Machinery. pp. 1523-1532.

Afshani, Peyman et al. "Cross-Referenced dictionaries and the limits of write optimization". Klein, P.N. (ed.). *28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017.* Association for Computing Machinery. 2017, 1523-1532.

Afshani P, Bender MA, Farach-Colton M, Fineman JT, Goswami M, Tsai MT. Cross-Referenced dictionaries and the limits of write optimization. In Klein PN, editor, 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017. Association for Computing Machinery. 2017. p. 1523-1532

Afshani, Peyman; Bender, Michael A.; Farach-Colton, Martin; Fineman, Jeremy T.; Goswami, Mayank; Tsai, Meng Tsung. / **Cross-Referenced dictionaries and the limits of write optimization**. 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017. editor / P.N. Klein. Association for Computing Machinery, 2017. pp. 1523-1532

@inproceedings{73c9ca02474a454691051bca4b038a26,

title = "Cross-Referenced dictionaries and the limits of write optimization",

abstract = "Dictionaries remain the most well-studied class of data structures. A dictionary supports insertions, deletions, membership queries, and usually successor, predecessor, and extract-min. In a RAM, all such operations take $O(\log N)$ time on $N$ elements. Dictionaries are often cross-referenced as follows. Consider a set of tuples $\{\langle a_i, b_i, c_i, \ldots \rangle\}$. A database might include more than one dictionary on such a set, for example, one indexed on the $a$'s, another on the $b$'s, and so on. Once again, in a RAM, inserting into a set of $L$ cross-referenced dictionaries takes $O(L \log N)$ time, as does deleting. The situation is more interesting in external memory. On a Disk Access Machine (DAM), B-trees achieve $O(\log_B N)$ I/Os for insertions and deletions on a single dictionary and $K$-element range queries take optimal $O(\log_B N + K/B)$ I/Os. These bounds are also achievable by a B-tree on cross-referenced dictionaries, with a slowdown of an $L$ factor on insertions and deletions. In recent years, both the theory and practice of external-memory dictionaries have been revolutionized by write-optimization techniques. A dictionary is write optimized if it is close to a B-tree for query time while beating B-trees on insertions. The best (and optimal) dictionaries achieve a substantially improved insertion and deletion cost of $O((\log_{1+B^{\varepsilon}} N)/B^{1-\varepsilon})$ amortized I/Os on a single dictionary while maintaining optimal $O(\log_{1+B^{\varepsilon}} N + K/B)$-I/O range queries. Although write optimization still helps for insertions into cross-referenced dictionaries, its value for deletions would seem to be greatly reduced. A deletion into a cross-referenced dictionary only specifies a key $a$. It seems to be necessary to look up the associated values $b, c, \ldots$ in order to delete them from the other dictionaries. This takes $\Omega(\log_B N)$ I/Os, well above the per-dictionary write-optimization budget of $O((\log_{1+B^{\varepsilon}} N)/B^{1-\varepsilon})$ I/Os. So the total deletion cost is $O(\log_B N + L (\log_{1+B^{\varepsilon}} N)/B^{1-\varepsilon})$ I/Os. In short, for deletions, write optimization offers an advantage over B-trees in that $L$ multiplies a lower-order term, but when $L = 2$, write optimization seems to offer no asymptotic advantage over B-trees. That is, no known query-optimal solution for pairs of cross-referenced dictionaries seems to beat B-trees for deletions. In this paper, we show a lower bound establishing that a pair of cross-referenced dictionaries that are optimal for range queries and that support deletions cannot match the write-optimization bound available to insert-only dictionaries. This result thus establishes a limit to the applicability of write-optimization techniques on which many new databases and file systems are based.",

author = "Peyman Afshani and Bender, {Michael A.} and Martin Farach-Colton and Fineman, {Jeremy T.} and Mayank Goswami and Tsai, {Meng Tsung}",

year = "2017",

month = jan,

day = "16",

language = "English",

pages = "1523--1532",

editor = "Klein, P.N.",

booktitle = "28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017",

publisher = "Association for Computing Machinery",

note = "28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017 ; Conference date: 16-01-2017 Through 19-01-2017",

}

TY - GEN

T1 - Cross-Referenced dictionaries and the limits of write optimization

AU - Afshani, Peyman

AU - Bender, Michael A.

AU - Farach-Colton, Martin

AU - Fineman, Jeremy T.

AU - Goswami, Mayank

AU - Tsai, Meng Tsung

PY - 2017/1/16

Y1 - 2017/1/16

N2 - Dictionaries remain the most well-studied class of data structures. A dictionary supports insertions, deletions, membership queries, and usually successor, predecessor, and extract-min. In a RAM, all such operations take O(log N) time on N elements. Dictionaries are often cross-referenced as follows. Consider a set of tuples {⟨a_i, b_i, c_i, ...⟩}. A database might include more than one dictionary on such a set, for example, one indexed on the a's, another on the b's, and so on. Once again, in a RAM, inserting into a set of L cross-referenced dictionaries takes O(L log N) time, as does deleting. The situation is more interesting in external memory. On a Disk Access Machine (DAM), B-trees achieve O(log_B N) I/Os for insertions and deletions on a single dictionary and K-element range queries take optimal O(log_B N + K/B) I/Os. These bounds are also achievable by a B-tree on cross-referenced dictionaries, with a slowdown of an L factor on insertions and deletions. In recent years, both the theory and practice of external-memory dictionaries have been revolutionized by write-optimization techniques. A dictionary is write optimized if it is close to a B-tree for query time while beating B-trees on insertions. The best (and optimal) dictionaries achieve a substantially improved insertion and deletion cost of O((log_{1+B^ε} N)/B^{1-ε}) amortized I/Os on a single dictionary while maintaining optimal O(log_{1+B^ε} N + K/B)-I/O range queries. Although write optimization still helps for insertions into cross-referenced dictionaries, its value for deletions would seem to be greatly reduced. A deletion into a cross-referenced dictionary only specifies a key a. It seems to be necessary to look up the associated values b, c, ... in order to delete them from the other dictionaries. This takes Ω(log_B N) I/Os, well above the per-dictionary write-optimization budget of O((log_{1+B^ε} N)/B^{1-ε}) I/Os. So the total deletion cost is O(log_B N + L (log_{1+B^ε} N)/B^{1-ε}) I/Os. In short, for deletions, write optimization offers an advantage over B-trees in that L multiplies a lower-order term, but when L = 2, write optimization seems to offer no asymptotic advantage over B-trees. That is, no known query-optimal solution for pairs of cross-referenced dictionaries seems to beat B-trees for deletions. In this paper, we show a lower bound establishing that a pair of cross-referenced dictionaries that are optimal for range queries and that support deletions cannot match the write-optimization bound available to insert-only dictionaries. This result thus establishes a limit to the applicability of write-optimization techniques on which many new databases and file systems are based.

AB - Dictionaries remain the most well-studied class of data structures. A dictionary supports insertions, deletions, membership queries, and usually successor, predecessor, and extract-min. In a RAM, all such operations take O(log N) time on N elements. Dictionaries are often cross-referenced as follows. Consider a set of tuples {⟨a_i, b_i, c_i, ...⟩}. A database might include more than one dictionary on such a set, for example, one indexed on the a's, another on the b's, and so on. Once again, in a RAM, inserting into a set of L cross-referenced dictionaries takes O(L log N) time, as does deleting. The situation is more interesting in external memory. On a Disk Access Machine (DAM), B-trees achieve O(log_B N) I/Os for insertions and deletions on a single dictionary and K-element range queries take optimal O(log_B N + K/B) I/Os. These bounds are also achievable by a B-tree on cross-referenced dictionaries, with a slowdown of an L factor on insertions and deletions. In recent years, both the theory and practice of external-memory dictionaries have been revolutionized by write-optimization techniques. A dictionary is write optimized if it is close to a B-tree for query time while beating B-trees on insertions. The best (and optimal) dictionaries achieve a substantially improved insertion and deletion cost of O((log_{1+B^ε} N)/B^{1-ε}) amortized I/Os on a single dictionary while maintaining optimal O(log_{1+B^ε} N + K/B)-I/O range queries. Although write optimization still helps for insertions into cross-referenced dictionaries, its value for deletions would seem to be greatly reduced. A deletion into a cross-referenced dictionary only specifies a key a. It seems to be necessary to look up the associated values b, c, ... in order to delete them from the other dictionaries. This takes Ω(log_B N) I/Os, well above the per-dictionary write-optimization budget of O((log_{1+B^ε} N)/B^{1-ε}) I/Os. So the total deletion cost is O(log_B N + L (log_{1+B^ε} N)/B^{1-ε}) I/Os. In short, for deletions, write optimization offers an advantage over B-trees in that L multiplies a lower-order term, but when L = 2, write optimization seems to offer no asymptotic advantage over B-trees. That is, no known query-optimal solution for pairs of cross-referenced dictionaries seems to beat B-trees for deletions. In this paper, we show a lower bound establishing that a pair of cross-referenced dictionaries that are optimal for range queries and that support deletions cannot match the write-optimization bound available to insert-only dictionaries. This result thus establishes a limit to the applicability of write-optimization techniques on which many new databases and file systems are based.

UR - http://www.scopus.com/inward/record.url?scp=85016230881&partnerID=8YFLogxK

M3 - Article in proceedings

AN - SCOPUS:85016230881

SP - 1523

EP - 1532

BT - 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017

A2 - Klein, P.N.

PB - Association for Computing Machinery

T2 - 28th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017

Y2 - 16 January 2017 through 19 January 2017

ER -