As main-memory sizes have grown, data systems have been able to process entire large-scale data-sets in memory. However, because memory speeds have been not been keeping pace with CPU speeds, the cost of moving data into CPU caches has begun to dominate certain operations within in-memory data systems. Recent advances in hardware architectures point to near memory computation capabilities becoming possible soon. This allows us to rethink how database systems process queries and how they split computation across the various computational units. In this paper, we present JAFAR, a near data processing accelerator for pushing selects down to memory. Through a detailed simulation of JAFAR hardware we show it has the potential to provide up to 900% improvement for select operations in modern column-stores.
S. Pantella and S. Idreos, “One Loop Does Not Fit All,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, Melbourne, Australia, 2015.Abstract
Just-In-Time (JIT) compilation increasingly becomes a key technology for modern database systems. It allows the creation of code on-the-fly to perfectly match an active query. In the past, it has been argued that a query should be compiled to a single loop that performs all query actions, for example, all selects over all relevant columns. On the other hand, vectorization – a common feature in modern data systems – allows for better results by evaluating the query predicates sequentially in different tight for-loops.
In this paper, we study JIT compilation for modern in- memory column-stores in detail and we show that, contrary to the common belief that vectorization outweighs the benefits of having one loop, there are cases in which creating a single loop is actually the optimal solution. In fact, deciding between multiple or a single loop is not a static decision; instead, it depends on (per column) query selectivity. We perform our experiments on a modern column-store prototype that supports vectorization and we show that, depending on selectivity, a different code layout is optimal. When a select operator is implemented with a no-branch design, for low selectivity creating multiple loops performs better than a single loop. A single tight loop performs better otherwise.
Curiosity, a fundamental drive amongst higher living organisms, is what enables exploration, learning and creativity. In our increasingly data-driven world, data exploration, i.e., making sense of mounting haystacks of data, is akin to intelligence for science, business and individuals. However, modern data systems – designed for data retrieval rather than exploration – only let us retrieve data and ask if it is interesting. This makes knowledge discovery a game of hit-and-trial which can only be orchestrated by expert data scientists.
We present the vision toward Queriosity, an automated and personalized data exploration system. Designed on the principles of autonomy, learning and usability, Queriosity envisions a paradigm shift in data exploration and aims to become a a personalized “data robot” that provides a direct answer to what is interesting in a user’s data set, instead of just retrieving data. Queriosity autonomously and continuously navigates toward interesting findings based on trends, statistical properties and interactive user feedback.
DASlab is a new laboratory at the Harvard School of Engineering and Applied Sciences (SEAS). The lab was formed in January 2014 when Stratos Idreos joined Harvard SEAS. DASlab currently consists of 3 PhD students, 1 postdoctoral researcher and 9 undergraduate researchers while it is set to double its graduate student population in the next one to two years. The lab is part of a growing community of systems and computer science researchers at Harvard; computer science faculty is scheduled to grow by 50\% in the next few years.
The main focus of DASlab is on designing data systems that (a) make it easy to extract knowledge out of increasingly diverse and growing data sets and (b) can stand the test of time.
As data collections become larger and larger, users are faced with increasing bottlenecks in their data analysis. More data means more time to prepare and to load the data into the database before executing the desired queries. Many applications already avoid using database systems, e.g., scientific data analysis and social networks, due to the complexity and the increased data-to-query time, i.e., the time between getting the data and retrieving its first useful results. For many applications data collections keep growing fast, even on a daily basis, and this data deluge will only increase in the future, where it is expected to have much more data than what we can move or store, let alone analyze.
We here present the design and roadmap of a new paradigm in database systems, called NoDB, which do not require data loading while still maintaining the whole feature set of a modern database system. In particular, we show how to make raw data files a first-class citizen, fully integrated with the query engine. Through our design and lessons learned by implementing the NoDB philosophy over a modern DBMS, we discuss the fundamental limitations as well as the strong opportunities that such a research path brings. We identify performance bottlenecks specific for in situ processing, namely the repeated parsing and tokenizing overhead and the expensive data type conversion. To address these problems, we introduce an adaptive indexing mechanism that maintains positional information to provide efficient access to raw data files, together with a flexible caching structure. We conclude that NoDB systems are feasible to design and implement over modern DBMS, bringing an unprecedented positive effect in usability and performance.
Data exploration is about efficiently extracting knowledge from data even if we do not know exactly what we are looking for. This has numerous side-effects on (a) how we design database systems at their core, i.e., at the storage and query processing layers and (b) how users or applications interact with systems.
In this tutorial, we survey recent developments in the emerging area of database systems tailored for data exploration. We discuss new ideas on how to store and access data as well as new ideas on how to interact with a data system to enable users and applications to quickly figure out which data parts are of interest. In addition, we discuss how to exploit lessons-learned from past research, the new challenges data exploration crafts, emerging applications and future directions.
Great database systems performance relies heavily on index tuning, i.e., creating and utilizing the best indices depending on the workload. However, the complexity of the index tuning process has dramatically increased in recent years due to ad-hoc workloads and shortage of time and system resources to invest in tuning.
This paper introduces holistic indexing, a new approach to automated index tuning in dynamic environments. Holistic indexing requires zero set-up and tuning effort, relying on adaptive index creation as a side-effect of query processing. Indices are created incrementally and partially; they are continuously refined as we process more and more queries. Holistic indexing takes the state-of-the-art adaptive indexing ideas a big step further by introducing the notion of a system which never stops refining the index space, taking educated decisions about which index we should incrementally refine next based on continuous knowledge acquisition about the running workload and resource utilization. When the system detects idle CPU cycles, it utilizes those extra cycles by refining the adaptive indices which are most likely to bring a benefit for future queries. Such idle CPU cycles occur when the system cannot exploit all available cores up to 100%, i.e., either because the workload is not enough to saturate the CPUs or because the current tasks performed for query processing are not easy to parallelize to the point where all available CPU power is exploited.
In this paper, we present the design of holistic indexing for column-oriented database architectures and we discuss a detailed analysis against parallel versions of state-of-the-art indexing and adaptive indexing approaches. Holistic indexing is implemented in an open-source column-store DBMS. Our detailed experiments on both synthetic and standard benchmarks (TPC-H) and workloads (SkyServer) demonstrate that holistic indexing brings significant performance gains by being able to continuously refine the physical design in parallel to query processing, exploiting any idle CPU resources.
Query optimizers depend heavily on statistics representing column distributions to create efficient query plans. In many cases, though, statistics are outdated or non-existent, and the process of refreshing statistics is very expensive, especially for ad-hoc workloads on ever bigger data. This results in suboptimal plans that severely hurt performance. The main problem is that any decision, once made by the optimizer, is fixed throughout the execution of a query. In particular, each logical operator translates into a fixed choice of a physical operator at run-time.
In this paper we advocate for continuous adaptation and morphing of physical operators throughout their lifetime, by adjusting their behavior in accordance with the statistical properties of the data. We demonstrate the benefits of the new paradigm by designing and implementing an adaptive access path operator called Smooth Scan, which morphs continuously within the space of traditional index access and full table scan. Smooth Scan behaves similarly to an index scan for low selectivity; if selectivity increases, however, Smooth Scan progressively morphs its behavior toward a sequential scan. As a result, a system with Smooth Scan requires no access path decisions up front nor does it need accurate statistics to provide good performance. We implement Smooth Scan in PostgreSQL and, using both synthetic benchmarks as well as TPC-H, we show that it achieves robust performance while at the same time being statistics-oblivious.
Database Cracking is an appealingly simple approach to adaptive indexing: on every range-selection query, the data is partitioned using the supplied predicates as pivots. The core of database cracking is, thus, pivoted partitioning. While pivoted partitioning, like scanning, requires a single pass through the data it tends to have much higher costs due to lower CPU efficiency. In this paper, we conduct an in-depth study of the reasons for the low CPU efficiency of pivoted partitioning. Based on the findings, we develop an optimized version with significantly higher (single-threaded) CPU efficiency. We also develop a number of multi-threaded implementations that are effectively bound by memory bandwidth. Combining all of these optimizations we achieve an implementation that has costs close to or better than an ordinary scan on a variety of systems ranging from low-end (cheaper than $300) desktop machines to high-end (above $10,000) servers.
Compression has historically been used to reduce the cost of storage, I/Os from that storage, and buffer pool utilization, at the expense of the CPU required to decompress data every time it is queried. However, significant additional CPU efficiencies can be achieved by deferring decompression as late in query processing as possible and performing query processing operations directly on the still-compressed data. In this paper, we investigate the benefits and challenges of performing joins on compressed (or encoded) data. We demonstrate the benefit of independently optimizing the compression scheme of each join column, even though join predicates relating values from multiple columns may require translation of the encoding of one join column into the encoding of the other. We also show the benefit of compressing “payload” data other than the join columns “on the fly,” to minimize the size of hash tables used in the join. By partitioning the domain of each column and defining separate dictionaries for each partition, we can achieve even better overall compression as well as increased flexibility in dealing with new values introduced by updates. Instead of decom- pressing both join columns participating in a join to resolve their different compression schemes, our system performs a light-weight mapping of only qualifying rows from one of the join columns to the encoding space of the other at run time. Consequently, join predicates can be applied directly on the compressed data. We call this procedure encoding translation. Two alternatives of encoding translation are developed and compared in the paper. We provide a comprehensive evaluation of these alternatives using product implementations of each on the TPC-H data set, and demonstrate that performing joins on encoded and partitioned data achieves both su- perior performance and excellent compression.
Adaptive indexing initializes and optimizes indexes incrementally, as a side effect of query processing. The goal is to achieve the benefits of indexes while hiding or minimizing the costs of index creation. However, index-optimizing side effects seem to turn read-only queries into update transactions that might, for example, create lock contention. This paper studies concurrency control and recovery in the context of adaptive indexing. We show that the design and implementation of adaptive indexing rigorously separates index structures from index contents; this relaxes constraints and requirements during adaptive indexing compared to those of traditional index updates. Our design adapts to the fact that an adaptive index is refined continuously and exploits any concurrency opportunities in a dynamic way. A detailed experimental analysis demonstrates that (a) adaptive indexing maintains its adaptive properties even when running concurrent queries, (b) adaptive indexing can exploit the opportunity for parallelism due to concurrent queries, (c) the number of concurrency conflicts and any concurrency administration overheads follow an adaptive behavior, decreasing as the workload evolves and adapting to the workload needs.
We study the problem of distributed resource sharing in peer-to-peer networks and focus on the problem of information filter- ing. In our setting, subscriptions and publications are specified using an expressive attribute-value representation that supports both the Boolean and Vector Space models. We use an extension of the distributed hash table Chord to organise the nodes and store user subscriptions, and utilise efficient publication protocols that keep the network traffic and latency low at filtering time. To verify our approach, we evaluate the proposed protocols experimentally using thousands of nodes, millions of user sub- scriptions, and two different real-life corpora. We also study three impor- tant facets of the load-balancing problem in such a scenario and present a novel algorithm that manages to distribute the load evenly among the nodes. Our results show that the designed protocols are scalable and efficient: they achieve expressive information filtering functionality with low message traffic and latency.
Modern state-of-the-art database systems are designed around a single data storage layout. This is a fixed decision that drives the whole design of the architecture of a database system, i.e., row-stores, column-stores, etc. However, none of those choices is a universally good solution; different workloads require different storage layouts and data access methods in order to achieve good performance. In this paper, we present the H2O system which introduces two novel concepts. First, it is flexible to support multiple storage layouts and data access patterns in a single engine. Second, and most importantly, it decides on-the-fly, i.e., during query processing, which design is best for classes of queries and the respective data parts. At any given point in time, parts of the data might be materialized in various patterns purely depending on the query workload; as the workload changes and with every single query, the storage and access patterns continuously adapt. In this way, H2O makes no a priori and fixed decisions on how data should be stored, allowing each single query to enjoy a storage and access pattern which is tailored to its specific properties. We present a detailed analysis of H2O using both synthetic benchmarks and realistic scientific workloads such as the SDSS (SkyServer). We demonstrate that while existing systems cannot achieve maximum performance across all workloads, H2O can always match the best case performance without requiring any tuning or workload knowledge.
Numerous applications continuously produce big amounts of data series, and in several time critical scenarios analysts need to be able to query these data as soon as they become available, which is not currently possible with the state-of-the-art indexing methods and for very large data series collections. In this paper, we present the first adaptive indexing mechanism, specifically tailored to solve the problem of indexing and querying very large data series collections. The main idea is that instead of building the complete index over the complete data set up-front and querying only later, we interactively and adaptively build parts of the index, only for the parts of the data on which the users pose queries. The net effect is that instead of waiting for extended periods of time for the index creation, users can immediately start exploring the data series. We present a detailed design and evaluation of adaptive data series indexing over both synthetic data and real-world workloads. The results show that our approach can gracefully handle large data series collections, while drastically reducing the data to query delay: by the time state-of-the-art indexing techniques finish indexing 1 billion data series (and before answering even a single query), adaptive data series indexing has already answered 3∗105 queries.
A fundamental need in the era of data deluge is data exploration through interactive tools, i.e., being able to quickly determine data and patterns of interest. dbTouch is a new research direction towards a next generation of data management systems that inherently support data exploration by allowing touch-based interaction. Data is represented in a visual format, while users can touch those shapes and interact/query with gestures. In a dbTouch system, the whole database kernel is geared towards quick responses in touch input; the user drives query processing (not just query construction) via touch gestures, dictating how fast or slow data flows through query plans and which data parts are processed at any time. dbTouch translates the gestures into interactive database operators, reacting continuously to the touch input and analytics tasks given by the user in real-time such as sliding a finger over a column to scan it progressively; zoom in with two fingers over a column to progressively get sample data; rotate a table to change the physical design from row-store to column-store, etc. This demo presents the first dbTouch prototype over iOS for iPad.
We are now entering the era of data deluge, where the
amount of data outgrows the capabilities of query processing technology.
Many emerging applications, from social networks to scientific experiments,
are representative examples of this deluge, where the rate at which data is
produced exceeds any past experience. For example, scientific analysis such
as astronomy is soon expected to collect multiple Terabytes of data on a daily
basis, while already web-based businesses such as social networks or web
log analysis are confronted with a growing stream of large data inputs.
Therefore, there is a clear need for efficient big data query processing to
enable the evolution of businesses and sciences to the new era of data
In this chapter, we focus on a new direction of query processing for big data
where data exploration becomes a first class citizen. Data exploration is
necessary when new big chunks of data arrive rapidly and we want to react
quickly, i.e., with little time to spare for tuning and set-up. In particular, our
discussion focuses on database systems technology, which for several
decades has been the predominant data processing tool.
In this chapter, we introduce the concept of data exploration and we discuss a
series of early techniques from the database community towards the direction
of building database systems which are tailored for big data exploration, i.e.,
adaptive indexing, adaptive loading and sampling-based query processing.
These directions focus on reconsidering fundamental assumptions and on
designing next generation database architectures for the big data era.
As we enter the era of data deluge,
turning data into knowledge has become the major challenge across most sciences and businesses that deal with data.
In addition, as we increase our ability to create data, more and more people are confronted
with data management problems on a daily basis for numerous aspects of every day life.
A fundamental need is data exploration through interactive tools, i.e.,
being able to quickly and effortlessly determine data and patterns of interest.
However, modern database systems have not been designed with data exploration and usability in mind;
they require users with expert knowledge and skills,
while they react in a strict and monolithic way to every user request, resulting in correct answers but slow response times.
In this paper, we introduce the vision of a new generation of data management systems, called dbTouch;
our vision is to enable interactive and intuitive data exploration via database kernels
which are tailored for touch-based exploration.
No expert knowledge is needed.
Data is represented in a visual format, e.g., a column shape for an attribute or a fat rectangle shape for a table,
while users can touch those shapes and interact/query with gestures as opposed to firing complex SQL queries.
The system does not try to consume all data; instead it analyzes only parts of the data at a time,
continuously refining the answers and continuously reacting to user input.
Every single touch on a data object can be seen as a request to run an operator or a collection of operators
over part of the data. Users react to running results and continuously adjust the data exploration -
they continuously determine the data to be processed next
by adjusting the direction and speed of a gesture, i.e., a collection of touches;
the database system does not have control on the data flow anymore.
We discuss the various benefits that dbTouch systems bring for data analytics as well as the
new and unique challenges for database research in combination with touch interfaces.
In addition, we provide an initial architecture, implementation and evaluation (and demo) of a dbTouch prototype
over IOs for IPad.
Continuous query processing has emerged as
a promising query processing paradigm with numerous applications.
A recent development is the need to handle both streaming queries and typical one-time queries
in the same application. For example, data warehousing can greatly benefit from the
integration of stream semantics, i.e., online analysis of incoming data and combination with existing data.
This is especially useful to provide low latency in data-intensive analysis
in big data warehouses that are augmented with new data on a daily basis.
However, state-of-the-art database technology cannot handle streams efficiently due to their
At the same time, state-of-the-art stream technology is purely focused on stream applications.
The research efforts are mostly geared towards the creation of specialized stream management systems
built with a different philosophy than a DBMS.
The drawback of this approach is the limited opportunities to exploit
successful past data processing technology, e.g., query optimization techniques.
For this new problem we need to combine the best of both worlds.
Here we take a completely different route by designing a
stream engine on top of an existing relational database kernel.
This includes reuse of both its storage/execution engine and its optimizer
infrastructure. The major challenge then becomes the efficient support
for specialized stream features.
This paper focuses on incremental window-based processing, arguably the most crucial
In order to maintain and reuse the generic storage and execution model of the DBMS,
we elevate the problem at the query plan level.
Proper optimizer rules, scheduling and intermediate result caching and reuse,
allow us to modify the DBMS query plans for efficient incremental processing.
We describe in detail the new approach and we demonstrate
efficient performance even against specialized stream engines, especially when scalability
becomes a crucial factor
In this article, we survey recent research on column-oriented database systems, or column-stores, where each attribute of a table is stored in a separate file or region on storage. Such databases have seen a resurgence in recent years with a rise in interest in analytic queries that perform scans and aggregates over large portions of a few columns of a table. The main advantage of a column-store is that it can access just the columns needed to answer such queries. We specifically focus on three influential research prototypes, MonetDB, VectorWise, and C-Store. These systems have formed the basis for several well-known commercial column-store implementations. We describe their similarities and differences and discuss their specific architectural features for compression, late materialization, join processing, vectorization and adaptive indexing (database cracking).
Physical design represents one of the hardest problems for database management systems. Without proper tuning, systems cannot achieve good performance. Offline indexing creates indexes a priori assuming good workload knowledge and idle time. More recently, online indexing monitors the workload trends and creates or drops indexes online. Adaptive indexing takes another step towards completely automating the tuning process of a database system, by enabling incremental and partial online indexing. The main idea is that physical design changes continuously, adaptively, partially, incrementally and on demand while processing queries as part of the execution operators. As such it brings a plethora of opportunities for rethinking and improving every single corner of database system design.
We will analyze the indexing space between offline, online and adaptive indexing through several state of the art indexing techniques, e. g., what-if analysis and soft indexes. We will discuss in detail adaptive indexing techniques such as database cracking, adaptive merging, sideways cracking and various hybrids that try to balance the online tuning overhead with the convergence speed to optimal performance. In addition, we will discuss how various aspects of modern techniques for database architectures, such as vectorization, bulk processing, column-store execution and storage affect adaptive indexing. Finally, we will discuss several open research topics towards fully autonomous database kernels.