Merging What's Cracked, Cracking What's Merged: Adaptive Indexing in Main-Memory Column-Stores

Publication information:

S. Idreos, S. Manegold, H. Kuno, and G. Graefe,

“Merging What’s Cracked, Cracking What’s Merged: Adaptive Indexing in Main-Memory Column-Stores”, Proceedings of the VLDB Endowment (PVLDB), vol. 4, no. 9, 2011.

Abstract

Adaptive indexing is characterized by the partial creation and refinement of the index as side effects of query execution. Dynamic or shifting workloads may benefit from preliminary index structures focused on the columns and specific key ranges actually queried, without incurring the cost of full index construction. The costs and benefits of adaptive indexing techniques should therefore be compared in terms of initialization costs, the overhead imposed upon queries, and the rate at which the index converges to a state that is fully refined for a particular workload component. Based on an examination of database cracking and adaptive merging, which are two techniques for adaptive indexing, we seek a hybrid technique that has a low initialization cost and also converges rapidly. We find the strengths and weaknesses of database cracking and adaptive merging complementary. One has a relatively high initialization cost but converges rapidly. The other has a low initialization cost but converges relatively slowly. We analyze the sources of their respective strengths and explore the space of hybrid techniques. We have designed and implemented a family of hybrid algorithms in the context of a column-store database system. Our experiments compare their behavior against database cracking and adaptive merging, as well as against both traditional full index lookup and scan of unordered data. We show that the new hybrids significantly improve over past methods while at least two of the hybrids come very close to the “ideal performance” in terms of both overhead per query and convergence to a final state.
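
To make "refinement of the index as side effects of query execution" concrete, the following is a minimal Python sketch of the basic database cracking idea. It is an illustration only, not the paper's hybrid algorithms or any system's actual implementation. The class name `CrackedColumn` and its methods are hypothetical, and for brevity each piece is partitioned with list comprehensions rather than in place.

```python
import bisect


class CrackedColumn:
    """Toy cracker column (hypothetical): a copy of the data plus a list of cracks."""

    def __init__(self, values):
        self.data = list(values)  # private copy; the base column is left untouched
        self.pivots = []          # sorted pivot values the column has been cracked on
        self.positions = []       # positions[i]: first index holding a value >= pivots[i]

    def _crack(self, pivot):
        """Partition only the piece that contains `pivot`; return the split position."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # already cracked on this value, nothing to move
        # Boundaries of the single piece that the new pivot falls into.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.positions) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger  # not in place; a simplification
        pos = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, pos)
        return pos

    def range_query(self, low, high):
        """Return all values v with low <= v < high (assumes low <= high).

        Answering the query cracks the column on both bounds, so the index
        is refined as a side effect.
        """
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


column = CrackedColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6])
first = column.range_query(5, 10)   # cracks on 5 and 10
second = column.range_query(7, 15)  # only touches the pieces containing 7 and 15
```

In this sketch the first query pays for partitioning the whole column, while later queries reorganize only the pieces their bounds fall into. This mirrors the low initialization cost and gradual convergence that the abstract attributes to one of the two techniques.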