Opportunistic View Materialization with Deep Reinforcement Learning

Materialized views, or pre-computed query results, can greatly speed up query answering in read-mostly database systems. In this project, we explore opportunistic materialization, where a database identifies and caches important intermediate results for future use during execution. Since real systems have storage constraints, the crucial problem is to design selection and eviction strategies.

It can be tricky to design such strategies by hand. Classical buffer management heuristics, like LFU or LRU, only consider usage rates and not the impact the view has for queries in a specific workload. Other heuristics like HAWC [1] and Recycler [2] can be sensitive to query optimizer cost estimates and heuristic choices or both. Rather than hand coding a heuristic, can we allow the database to self-optimize its own materialized view cache?

Inspired by recent work that uses Deep RL to more accurately optimize queries (through learning from execution feedback) [3], we apply similar approach to opportunistic materialization. The system starts with a random selection strategy and observes choices that seem to improve actual query runtime. As it takes decisions it builds a model that correlates these choices to features of the views created, workload, and system state. Decisions that helped in the past are remembered and ones that hurt are forgotten.

The set of possible views are huge and directly applying standard RL algorithms can take a very long time to converge as the aggregate query runtimes are a very weak performance signal. Our insight is that a better view selection policies can be effectively trained with a novel asynchronous RL algorithm which runs paired counterfactual experiments during system idle times to evaluate the incremental value of persisting certain views. It runs the query with and without the view and uses the difference in performance as a feedback signal.

In our research prototype system called DQM, we focus on inner-join views. We implemented a view miner that can generate view candidates by mining the logical plan of queries, a query rewriter that can rewrite a query with a view and a view manager that can manage (create/drop) views. We integrate our system with SparkSQL and build our DeepRL agent for decision making (view selection and eviction) with Python that communicate with the Spark environment via the RESTful API.

We evaluate DQM with workloads derived from the Join-Order-Benchmark and TPC-DS. Results suggest that DQM is more robust across different workloads and performs competitively with a hypothetical near-optimal baseline after training. DQM is especially powerful when there are stochastic effects in the system (e.g., maintenance) that lead to additional costs.