Database Cracking: Fancy Scan, not Poor Man’s Sort!

Citation:

H. Pirk, E. Petraki, S. Idreos, S. Manegold, and M. L. Kersten, “Database Cracking: Fancy Scan, not Poor Man’s Sort!” in Proceedings of the 10th International Workshop on Data Management on New Hardware (DaMoN), Snowbird, Utah, 2014.
crackingscanorsort.pdf1.99 MB

Abstract:

Database Cracking is an appealingly simple approach to adaptive indexing: on every range-selection query, the data is partitioned using the supplied predicates as pivots. The core of database cracking is, thus, pivoted partitioning. While pivoted partitioning, like scanning, requires a single pass through the data it tends to have much higher costs due to lower CPU efficiency. In this paper, we conduct an in-depth study of the reasons for the low CPU efficiency of pivoted partitioning. Based on the findings, we develop an optimized version with significantly higher (single-threaded) CPU efficiency. We also develop a number of multi-threaded implementations that are effectively bound by memory bandwidth. Combining all of these optimizations we achieve an implementation that has costs close to or better than an ordinary scan on a variety of systems ranging from low-end (cheaper than $300) desktop machines to high-end (above $10,000) servers.

Last updated on 06/02/2014