Kernel-Assisted Copy-on-Write Snapshots for Main-Memory HTAP Databases

Abstract

Conventional database management systems (DBMS) are typically oriented towards either online transaction processing (OLTP) or online analytical processing (OLAP). Recently, this dichotomy has been upended by the emergence of so-called hybrid transactional/analytical processing (HTAP) DBMS that accommodate both analytical and transactional queries in a single database. An early representative of this category is the main-memory system HyPer, which used the copy-on-write semantics of the POSIX system call fork to isolate OLAP workloads on a transaction-consistent virtual memory snapshot of the primary database.

This thesis revisits the idea of virtual memory copy-on-write snapshots for HTAP workloads in main-memory databases. We explore two alternative kernel-supported snapshot mechanisms, scoot and an extension of mremap, and compare them with fork against a set of criteria. To this end, we present ScooterDB, an in-memory relational hybrid storage engine that efficiently and transparently supports multiple snapshot mechanisms in a single codebase. We run extensive experiments based on the popular TPC-H and YCSB benchmarks and discuss the strengths and weaknesses of the analysed methods. With custom fine-grained snapshot mechanisms, ScooterDB achieves 23% lower average OLTP latency and up to 75% shorter snapshot creation times than with fork.
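
For readers unfamiliar with the fork-based approach mentioned above, the following is a minimal sketch of the general idea, not the implementation used by HyPer or ScooterDB. It assumes a Unix-like system; the function name olap_sum_on_snapshot, the list-based "table", and the pipe-based hand-off are illustrative choices. The child process created by fork sees the table as it was at the moment of the fork, so an update applied by the parent afterwards does not affect the analytical result.

```python
import os


def olap_sum_on_snapshot(table, oltp_update):
    """Sum `table` in a forked child while the parent keeps updating it.

    Hypothetical helper for illustration only: the child stands in for an
    OLAP query, and `oltp_update` stands in for a concurrent transaction.
    """
    result_r, result_w = os.pipe()  # child -> parent: query result
    go_r, go_w = os.pipe()          # parent -> child: "update applied"

    pid = os.fork()
    if pid == 0:
        # Child: its address space is a copy-on-write view taken at fork time.
        os.close(result_r)
        os.close(go_w)
        os.read(go_r, 1)            # wait until the parent has written
        total = sum(table)          # still sees the pre-update values
        os.write(result_w, str(total).encode())
        os._exit(0)

    # Parent: continues "transaction processing" on the live data.
    os.close(result_w)
    os.close(go_r)
    oltp_update(table)              # modifies only the parent's pages
    os.write(go_w, b"x")
    os.close(go_w)

    data = os.read(result_r, 64)
    os.close(result_r)
    os.waitpid(pid, 0)
    return int(data.decode())


if __name__ == "__main__":
    table = [1, 2, 3]

    def update(t):
        t[0] = 100

    print("snapshot sum:", olap_sum_on_snapshot(table, update))
    print("live sum:", sum(table))
```

The sketch forks the entire process, so every mapped page becomes part of the snapshot. The finer-grained mechanisms examined in this thesis are compared against exactly this baseline.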