Publications by Type: Thesis

2010
S. Idreos, “Database Cracking: Towards Auto-tuning Database Kernels,” 2010.
Abstract
Indices are heavily used in database systems to achieve the best possible query processing performance. Creating an index takes a lot of time, and the system needs to reserve extra storage space for the auxiliary data structure. When updates arrive, there is also the overhead of maintaining the index. Consequently, deciding which indices to create and when to create them has been, and still is, one of the most important research topics of the last decades. If the workload is known up front or can be predicted, and if there is enough idle time to spare, then we can create all necessary indices a priori and exploit them when queries arrive. But what happens if we do not have this knowledge or idle time? Similarly, what happens if the workload changes often, suddenly, and in an unpredictable way? Even if we can correctly analyze the current workload, it may well be that by the time we finish our analysis and create all necessary indices, the workload pattern has changed. Here we argue that a database system should just be given the data and queries in a declarative way, and the system should internally take care of finding not only the proper algorithms and query plans but also the proper physical design to match the workload and application needs. The goal is to remove the role of database administrators, leading to systems that can self-tune completely automatically and adapt even to dynamic environments. Database Cracking implements the first adaptive kernel that automatically adapts to the access patterns by selectively and adaptively optimizing the data set purely for the workload at hand. It continuously reorganizes input data on the fly as a side effect of query processing, using queries as advice on how data should be stored. Everything happens within operator calls during query processing and brings knowledge to the system that future operators in future queries can exploit. Essentially, the necessary indices are built incrementally as the system gains more and more knowledge about the workload needs.
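The idea of reorganizing data as a side effect of queries can be illustrated with a minimal Python sketch. It is not the thesis's kernel implementation: the class `CrackedColumn`, its methods, and the half-open range semantics are hypothetical names and assumptions chosen for illustration, and the partitioning step builds new lists rather than swapping values in place. Each range query splits only the pieces of the column that contain its bounds and records the split points, so later queries touch less data.

```python
import bisect


class CrackedColumn:
    """Toy cracked column (illustrative only, not the thesis's code)."""

    def __init__(self, values):
        self.data = list(values)  # copy that queries physically reorganize
        self.pivots = []          # sorted pivot values seen so far
        self.positions = []       # positions[i]: first index with value >= pivots[i]

    def _crack(self, pivot):
        """Partition the piece containing `pivot`; return the split index."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # already cracked on this pivot
        # Only the piece between the neighbouring cracks is touched.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.positions) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger
        split = lo + len(smaller)
        # Remember the crack so future queries can reuse it.
        self.pivots.insert(i, pivot)
        self.positions.insert(i, split)
        return split

    def range_query(self, low, high):
        """Return values v with low <= v < high, cracking as a side effect."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]


column = CrackedColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6])
first = column.range_query(5, 10)   # cracks the column on 5 and 10
second = column.range_query(8, 15)  # reuses those cracks, adds 8 and 15
```

After the two queries the column is partitioned around the pivots 5, 8, 10, and 15, although no index was built up front. A third query on, say, the range 8 to 10 would be answered from the recorded positions without moving any data.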
DBcrackingThesis.pdf