Intelligent Databases: Intelligent Parameter Tuning

April 7, 2024. Source: Michael

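This post walks through a SIGMOD paper. "Automatic Database Management System Tuning Through Large-scale Machine Learning" is an automatic parameter tuning paper published at SIGMOD 2017 by CMU professor Andy Pavlo, his PhD student Dana, and others; the system is called OtterTune. The paper cites an earlier paper, "Tuning Database Configuration Parameters with iTuned", published at VLDB 2009. Personally, I think the two do not differ much in their core idea.

Below is OtterTune's workflow and core idea.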

Let's take an example. Suppose I already have history data for two metrics (innodb_pages_reads, innodb_io_reads), three workloads (TPCC, YCSB, Wikipedia), and four configurations. That gives two matrices, one per metric, where NULL marks a missing measurement:

Matrix1 (innodb_pages_reads)

         conf1  conf2  conf3  conf4
TPCC     20     30     40     50
YCSB     100    NULL   300    400
WIKI     50     60     NULL   80

Matrix2 (innodb_io_reads)

         conf1  conf2  conf3  conf4
TPCC     200    300    400    500
YCSB     100    NULL   300    400
WIKI     500    600    NULL   800
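
To make the layout concrete, here is a minimal Python sketch that stores the two matrices and compares one new observation against each historical workload under a given configuration. It is a simplification, not OtterTune's actual code: the function name `distances_under`, the max-value scaling, and the observed values (22 and 210) are assumptions made up for illustration.

```python
import math

workloads = ["TPCC", "YCSB", "WIKI"]
configs = ["conf1", "conf2", "conf3", "conf4"]

# One matrix per metric: rows follow `workloads`, columns follow `configs`.
# None stands for a NULL cell in the tables above.
history = {
    "innodb_pages_reads": [
        [20, 30, 40, 50],
        [100, None, 300, 400],
        [50, 60, None, 80],
    ],
    "innodb_io_reads": [
        [200, 300, 400, 500],
        [100, None, 300, 400],
        [500, 600, None, 800],
    ],
}


def distances_under(config, observed):
    """Euclidean distance from `observed` to each workload under `config`.

    Each metric is divided by the largest known value in its matrix so that
    large-valued metrics do not dominate (an assumption of this sketch).
    Workloads with a NULL cell for `config` get an infinite distance.
    """
    col = configs.index(config)
    result = {}
    for row, name in enumerate(workloads):
        total = 0.0
        for metric, matrix in history.items():
            cell = matrix[row][col]
            if cell is None:
                total = math.inf
                break
            scale = max(v for r in matrix for v in r if v is not None)
            total += ((cell - observed[metric]) / scale) ** 2
        result[name] = math.sqrt(total)
    return result


# Hypothetical metrics measured for a new workload running with conf1.
observed = {"innodb_pages_reads": 22, "innodb_io_reads": 210}
dists = distances_under("conf1", observed)
nearest = min(dists, key=dists.get)
print(nearest, dists)
```

With these made-up observations, TPCC comes out as the nearest workload, because its conf1 values (20 and 200) are much closer to (22, 210) than those of YCSB or WIKI.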

The recommendation steps for the target workload (Aliworkload) would look like this:

1. Run the current target workload (Xworkload) with conf1 (the default configuration) and collect its metrics. Suppose the 5 observed metric values are (11, 20, 31, 40, 51). Compared against the history data under the same configuration, the most similar workload is TPC-C.
2. Take all of the previous data for TPC-C and combine it with all of the data collected so far from the current workload. Use this data to train a GP (Gaussian process) model: the configurations are the input matrix, and the target objective metric, such as latency, is the output matrix. Then, starting from a set of sample configurations (say, randomly generated for now), use the GP model to predict the mean and variance at each sample point, and use gradient descent to walk towards the nearest optimum (for latency, the nearest minimum). Use an exploration/exploitation tradeoff algorithm such as UCB (upper confidence bound; for a metric like latency, where lower is better, lower confidence bound) to select the next configuration to run. Call this conf2. See https://github.com/cmu-db/ott… and https://github.com/cmu-db/ott… for more details.
3. Install conf2 on the DBMS and observe the workload for some minutes or hours.
4. Repeat steps 1 to 3 until satisfied with the improvement. Note that in the next iteration of step 1, the metrics are predicted for TPC-C, YCSB, and Wiki under both conf1 and conf2, so the workload first selected as most similar may change over the course of the tuning session.
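
The GP step can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not OtterTune's implementation: the helper names `rbf_kernel` and `gp_predict`, the two knob values per configuration (scaled to [0, 1]), the latencies, the kernel length scale, and `kappa` are all hypothetical. It also skips the gradient descent part and simply scores random candidate configurations with a lower confidence bound.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two sets of configurations."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)


def gp_predict(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and standard deviation of a GP with an RBF kernel."""
    # Standardize the targets so the unit-variance kernel is a sensible prior.
    y_mean, y_std = y_train.mean(), y_train.std()
    y_norm = (y_train - y_mean) / y_std
    k_train = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_cross = rbf_kernel(x_test, x_train)
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_norm))
    mean = k_cross @ alpha
    v = np.linalg.solve(chol, k_cross.T)
    var = np.clip(1.0 - (v ** 2).sum(axis=0), 0.0, None)
    # Map predictions back to the original latency scale.
    return mean * y_std + y_mean, np.sqrt(var) * y_std


# Hypothetical history for the most similar workload (TPC-C): one row per
# configuration, two knob values each, latency in milliseconds.
tpcc_configs = np.array([[0.1, 0.2], [0.4, 0.3], [0.6, 0.7], [0.9, 0.8]])
tpcc_latency = np.array([12.0, 9.0, 7.5, 10.0])

# Hypothetical data collected so far from the target workload (conf1 only).
target_configs = np.array([[0.1, 0.2]])
target_latency = np.array([12.5])

# Step 2: combine both data sets and fit the GP.
x_train = np.vstack([tpcc_configs, target_configs])
y_train = np.concatenate([tpcc_latency, target_latency])

# Score random candidate configurations.
rng = np.random.default_rng(0)
candidates = rng.random((200, 2))
mean, std = gp_predict(x_train, y_train, candidates)

# Lower confidence bound: latency is better when lower, so pick the minimum.
kappa = 2.0
lcb = mean - kappa * std
next_config = candidates[np.argmin(lcb)]
print("next configuration to try (conf2):", next_config)
```

A larger `kappa` favors candidates with high predicted variance (exploration), while a smaller one favors candidates with low predicted latency (exploitation). After conf2 has been run on the DBMS, its knob values and measured latency would be appended to `target_configs` and `target_latency` before the next iteration.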

    Original author: Michael
    Original URL: https://segmentfault.com/a/1190000018992001