S. Idreos, “Database Cracking: Towards Auto-tuning Database Kernels,” 2010.
Abstract: Indices are heavily used in database systems in order to achieve the ultimate
query processing performance. It takes a lot of time to create an index and the
system needs to reserve extra storage space to store the auxiliary data structure.
When updates arrive, there is also the overhead of maintaining the index. Thus,
which indices to create and when to create them has been, and still is, one
of the most important research topics of the last decades.
If the workload is known up-front or it can be predicted and if there is
enough idle time to spare, then we can a priori create all necessary indices and
exploit them when queries arrive. But what happens if we do not have this
knowledge or idle time? Similarly, what happens if the workload changes often,
suddenly and in an unpredictable way? Even if we can correctly analyze the
current workload, it may well be that by the time we finish our analysis and
create all necessary indices, the workload pattern has changed.
Here we argue that a database system should just be given the data and
queries in a declarative way and the system should internally take care of finding
not only the proper algorithms and query plans but also the proper physical
design to match the workload and application needs. The goal is to remove
the role of database administrators, leading to systems that can completely
automatically self-tune and adapt even to dynamic environments. Database
Cracking implements the first adaptive kernel that automatically adapts to the
access patterns by selectively and adaptively optimizing the data set purely for
the workload at hand. It continuously reorganizes input data on the fly as a
side effect of query processing, using queries as advice on how data should
be stored. Everything happens within operator calls during query processing
and brings knowledge to the system that future operators in future queries can
exploit. Essentially, the necessary indices are built incrementally as the system
gains more and more knowledge about the workload needs.
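
The incremental reorganization described above can be illustrated with a small sketch. This is not the thesis's implementation; it is a toy in-memory model that assumes a single integer column and half-open range predicates (low <= v < high). The names CrackerColumn, _crack and range_query are hypothetical and chosen only for illustration. Each range query partitions just the piece of the column that contains its bounds, and the recorded split points let later queries touch smaller pieces.

```python
import bisect


class CrackerColumn:
    """Toy cracker column (hypothetical): a reorganized copy of the values
    plus a sorted list of the pivots the column has been cracked on."""

    def __init__(self, values):
        self.data = list(values)  # copy that gets physically reorganized
        self.pivots = []          # sorted pivot values seen so far
        self.positions = []       # positions[i]: first index holding a value >= pivots[i]

    def _crack(self, pivot):
        """Make all values < pivot precede all values >= pivot.

        Only the piece that contains the pivot is touched.
        Returns the index of the first value >= pivot.
        """
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # an earlier query already cracked here

        # Boundaries of the piece between the neighbouring pivots.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.pivots) else len(self.data)

        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger

        split = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, split)
        return split

    def range_query(self, low, high):
        """Return all values v with low <= v < high (assumes low <= high).

        The column is cracked on both bounds as a side effect, so the
        answer ends up in one contiguous slice.
        """
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


column = CrackerColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6])
first = column.range_query(5, 10)   # partitions the whole column
second = column.range_query(7, 12)  # only re-partitions two small pieces
```

After the two queries, the column holds four split points (5, 7, 10, 12), none of which was created ahead of time. A production kernel would partition in place rather than build temporary lists; the list comprehensions here are a simplification for readability.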