Schloss Dagstuhl Seminar: “Robust Performance in Database Query Processing” ‒ DIAS ‐ EPFL

Angelos and Tahir attended the Dagstuhl seminar on “Robust Performance in Database Query Processing” from May 28 to June 2, 2017 (https://www.dagstuhl.de/en/program/calendar/semhp/?semnr=17222). Schloss Dagstuhl is a meeting center for computer science research located in Saarland, Germany. Since its foundation in 1990, it has hosted several seminars and workshops, bringing together researchers and practitioners from both industry and academia. Each seminar is typically one week long, from Sunday afternoon to Friday noon. Since participants live and work in the same building, they spend the whole week in close quarters, which stimulates not only their research but also their social interaction. The venue is surrounded by a forest, which offers a great opportunity for long hikes through the woods and on nearby walking trails.

The seminar on “Robust Performance in Database Query Processing” is the successor of two earlier seminars on the same topic, which took place in 2010 and 2012. Organized by Renata Borovica-Gajic, Goetz Graefe and Allison Lee, the seminar aimed to come up with ideas for research projects that enable robust performance in database systems. The term “robust performance” was itself controversial; in the end, everyone agreed to an informal definition of “good performance every time”. So the goal of each research project was to reduce or, ideally, eliminate performance disruptions in database systems, which can arise for a variety of reasons.

The seminar we attended had 25 participants, who split into four working groups focusing on (1) the optimal sequencing of operators in query execution, (2) database updates and associated robustness issues, (3) the parallelization of workloads in the face of severe skew, and (4) the application of machine learning to better understand the performance of database systems during query execution. Each working group was responsible for delivering performance metrics and benchmarks and for framing solutions to the problem it was focusing on. Days were split into sessions where people met only within their group, and sessions where all of the participants met together to share their findings, ask questions of other groups and receive feedback on how to proceed. This gave us the opportunity to get a good look at what everybody had been doing and grasp the basic concepts behind their ideas. We were involved in the working groups focusing on parallelization and skew, and on learning to identify “non-robust” behavior, so we will describe our experiences in these two groups.

Angelos was part of the working group that focused on parallelization and skew. The group first addressed the question of choosing the right benchmark and an appropriate metric to stress and evaluate the robustness of a database system running parallel joins in the presence of severe skew. The group members, researchers from both academia and industry, came up with novel ideas on how to assess the robustness of a database system; although the approach primarily focuses on skewed workloads, it can potentially be extended to a more general context. Moreover, they started specifying a benchmark that can generate data with various forms of skew and that provides a concrete workload model for stressing and evaluating the robustness of the system in the presence of these forms of skew. Finally, the group members outlined prior work in the literature, going back to the 1990s, that has addressed the problems of skewed workloads from various perspectives.
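To make the idea of a skew benchmark a little more concrete, here is a minimal sketch of how skewed join keys could be generated and how their effect on a parallel system could be measured. It is not the benchmark the group specified: the Zipf-like distribution, the modulo partitioning scheme, the imbalance metric and all function names (`zipf_keys`, `partition_imbalance`) are assumptions made purely for illustration.

```python
import random
from collections import Counter


def zipf_keys(num_rows, num_distinct, skew, seed=0):
    """Draw join keys 1..num_distinct with P(k) proportional to 1 / k**skew.

    skew = 0 gives a uniform distribution; larger values concentrate
    more rows on the smallest keys. (Hypothetical helper, for illustration.)
    """
    rng = random.Random(seed)
    keys = list(range(1, num_distinct + 1))
    weights = [1.0 / (k ** skew) for k in keys]
    return rng.choices(keys, weights=weights, k=num_rows)


def partition_imbalance(keys, num_workers):
    """Largest partition size divided by the average partition size.

    Rows are assigned to workers by key modulo num_workers, a stand-in
    for hash partitioning. A value of 1.0 means perfectly balanced work.
    """
    loads = Counter(k % num_workers for k in keys)
    average = len(keys) / num_workers
    return max(loads.values()) / average


if __name__ == "__main__":
    for skew in (0.0, 0.5, 1.0, 1.5):
        keys = zipf_keys(num_rows=100_000, num_distinct=1_000, skew=skew)
        print(f"skew={skew}: imbalance={partition_imbalance(keys, 8):.2f}")
```

In a sketch like this, the imbalance ratio grows as the skew parameter increases, because a few heavy keys end up on the same worker. A robust parallel join would ideally keep its running time roughly flat across that range.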

Tahir participated in a working group whose goal was to automatically identify queries exhibiting unexpectedly slow performance and fix the underlying causes. It soon became clear that this was a massive undertaking, since even the definition of a “slow” query was unclear and there was a whole host of reasons why a query might be slow. So the group focused instead on making slow query performance explainable to users, where a slow query was defined as a query whose performance a user complained about. Based on this definition, the idea was to collect statistics and build performance models for each operator in a query, so that a user could be shown a visual explanation for a slow query. To flesh out this idea further, the group spent a substantial amount of time creating a taxonomy of possible causes of slow performance, coming up with possible benchmarks for experimental validation, and reviewing related work in the area of modeling query performance.

In our opinion, the seminar was a great success. All of the participants were excited about the progress they made during the week, sharing their research ideas and trying to provide solutions in an important, high-impact area of database systems. Moreover, social interactions among the participants were stimulating, good-natured and positive, making it a very enjoyable experience overall.

by Tahir Azim & Angelos Anadiotis