Cosine: A Cloud-Cost Optimized Self-Designing Key-Value Storage Engine

Citation:

S. Chatterjee, M. Jagadeesan, W. Qin, and S. Idreos, “Cosine: A Cloud-Cost Optimized Self-Designing Key-Value Storage Engine,” in Proceedings of the VLDB Endowment (PVLDB), 2022.

Abstract:

We present a self-designing key-value storage engine, Cosine, which can always take the shape of a close-to-“perfect” engine architecture given an input workload, a cloud budget, a target performance, and required cloud SLAs. By identifying and formalizing the first principles of storage engine layouts and core key-value algorithms, Cosine constructs a massive design space comprising sextillion (10^36) possible storage engine designs over a diverse space of hardware and cloud pricing policies for three cloud providers: AWS, GCP, and Azure. Cosine spans diverse designs such as Log-Structured Merge-trees, B-trees, Log-Structured Hash-tables, and in-memory accelerators for filters and indexes, as well as trillions of hybrid designs that do not appear in the literature or industry but emerge as valid combinations of the above. Cosine includes a unified distribution-aware I/O model and a learned concurrency-aware CPU model that can calculate, with high accuracy, the performance and cloud cost of any possible design on any workload and virtual machine. Cosine can then search through that space in a matter of seconds to find the best design and materialize the actual code of the resulting storage engine design using a templated Rust implementation. We demonstrate that on average Cosine outperforms state-of-the-art storage engines such as write-optimized RocksDB, read-optimized WiredTiger, and very write-optimized FASTER by 53x, 25x, and 20x, respectively, for diverse workloads, data sizes, and cloud budgets across all YCSB core workloads and many variants.
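
As a rough illustration of the idea of searching a design space under a cloud budget, the toy Python sketch below enumerates a handful of (design, virtual machine) pairs, scores each with a simple I/O cost model, and keeps the fastest pair that fits the budget. This is not Cosine's actual model or search algorithm: the `Design` and `VM` types, the per-operation I/O counts, the IOPS figures, and the prices are all hypothetical placeholders chosen only to make the example run.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Design:
    """A hypothetical storage engine design with made-up I/O costs."""
    layout: str        # e.g. "lsm", "btree", "lsh"
    read_ios: float    # assumed average I/Os per read
    write_ios: float   # assumed average (amortized) I/Os per write


@dataclass(frozen=True)
class VM:
    """A hypothetical virtual machine offering."""
    name: str
    iops: float            # assumed sustained I/Os per second
    price_per_hour: float  # assumed price in dollars per hour


def workload_hours(design, vm, reads, writes):
    """Toy I/O-only model: total I/Os divided by the VM's IOPS, in hours."""
    total_ios = reads * design.read_ios + writes * design.write_ios
    return total_ios / vm.iops / 3600.0


def best_design(designs, vms, reads, writes, budget_per_hour):
    """Exhaustively search all (design, VM) pairs that fit the hourly budget.

    Returns (hours, design, vm) for the fastest affordable pair,
    or None if no VM fits the budget.
    """
    candidates = [
        (workload_hours(d, v, reads, writes), d, v)
        for d, v in product(designs, vms)
        if v.price_per_hour <= budget_per_hour
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c[0])


# Illustrative inputs only; none of these numbers come from the paper.
DESIGNS = [
    Design("lsm", read_ios=2.0, write_ios=0.1),
    Design("btree", read_ios=1.0, write_ios=2.0),
    Design("lsh", read_ios=1.5, write_ios=0.05),
]
VMS = [
    VM("small", iops=3000.0, price_per_hour=0.10),
    VM("large", iops=12000.0, price_per_hour=0.40),
]

if __name__ == "__main__":
    result = best_design(DESIGNS, VMS, reads=9_000_000, writes=1_000_000,
                         budget_per_hour=0.20)
    if result is not None:
        hours, design, vm = result
        print(f"{design.layout} on {vm.name}: {hours:.4f} h")
```

The sketch enumerates every pair, which only works because the toy space has six candidates. A design space of the size described in the abstract cannot be enumerated this way, which is why the paper pairs its cost models with a search procedure that runs in seconds.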

Last updated on 05/16/2022