Merging What's Cracked, Cracking What's Merged: Adaptive Indexing in Main-Memory Column-Stores

Citation:

S. Idreos, S. Manegold, H. A. Kuno, and G. Graefe, “Merging What's Cracked, Cracking What's Merged: Adaptive Indexing in Main-Memory Column-Stores,” Proceedings of the Very Large Databases Endowment (PVLDB), vol. 4, pp. 585-597, 2011.

Abstract:

Adaptive indexing is characterized by the partial creation and refinement of the index as side effects of query execution. Dynamic or shifting workloads may benefit from preliminary index structures focused on the columns and specific key ranges actually queried --- without incurring the cost of full index construction. The costs and benefits of adaptive indexing techniques should therefore be compared in terms of initialization costs, the overhead imposed upon queries, and the rate at which the index converges to a state that is fully-refined for a particular workload component. Based on an examination of database cracking and adaptive merging, which are two techniques for adaptive indexing, we seek a hybrid technique that has a low initialization cost and also converges rapidly. We find the strengths and weaknesses of database cracking and adaptive merging complementary. One has a relatively high initialization cost but converges rapidly. The other has a low initialization cost but converges relatively slowly. We analyze the sources of their respective strengths and explore the space of hybrid techniques. We have designed and implemented a family of hybrid algorithms in the context of a column-store database system. Our experiments compare their behavior against database cracking and adaptive merging, as well as against both traditional full index lookup and scan of unordered data. We show that the new hybrids significantly improve over past methods while at least two of the hybrids come very close to the ``ideal performance'' in terms of both overhead per query and convergence to a final state.

Last updated on 12/21/2013