SQL Optimization in MySQL

January 29, 2023 · Source: 罗志贇

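Use the SHOW STATUS command to see how often each kind of SQL statement runs

Com_select: the number of SELECT statements executed; each query adds 1.

Com_insert: the number of INSERT statements executed; a multi-row (batch) INSERT is counted only once.

Com_update: the number of UPDATE statements executed.

Com_delete: the number of DELETE statements executed.

These counters accumulate across tables of every storage engine. The following counters apply only to the InnoDB storage engine, and they count rows rather than statements.

Innodb_rows_read: the number of rows read from InnoDB tables.

Innodb_rows_inserted: the number of rows inserted by INSERT operations.

Innodb_rows_updated: the number of rows updated by UPDATE operations.

Innodb_rows_deleted: the number of rows deleted by DELETE operations.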
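
As a minimal sketch, the counters described above can be read as follows. SHOW GLOBAL STATUS reports server-wide totals since startup, while SHOW SESSION STATUS covers only the current connection; which scope is more useful depends on what you are measuring.

```sql
-- Statement counters for the whole server since it started.
SHOW GLOBAL STATUS
WHERE Variable_name IN ('Com_select', 'Com_insert', 'Com_update', 'Com_delete');

-- Row counters maintained by InnoDB.
SHOW GLOBAL STATUS LIKE 'Innodb_rows_%';

-- The same kind of statement counter, limited to the current connection.
SHOW SESSION STATUS LIKE 'Com_select';
```

Comparing the Com_select total with the sum of the other three gives a rough idea of whether the workload is read-heavy or write-heavy.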

Use EXPLAIN to analyze the execution plan of an inefficient SQL statement

mysql> explain select sum(moneys) from sales a,company b where a.company_id = b.id and a.year = 2006\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: b
type: ref
possible_keys: ind_company_id
key: ind_company_id
key_len: 5
ref: sakila.a.company_id
rows: 1
Extra: Using where; Using index
2 rows in set (0.00 sec)

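select_type: the type of SELECT. Common values are SIMPLE (a simple query that uses no UNION or subquery), PRIMARY (the main query, that is, the outermost query), UNION (the second or a later query in a UNION), and SUBQUERY (the first SELECT in a subquery).

table: the table that the output row refers to.

type: the join (access) type of the table. From best to worst performance, the types are:
system: the table has only one row (a constant table).
const: at most one row in a single table matches, for example a lookup by primary key or unique index.
eq_ref: for each row from the preceding tables, exactly one row is read from this table; in short, a multi-table join on a primary key or unique index.
ref: like eq_ref, except that an ordinary index is used instead of a primary key or unique index.
ref_or_null: like ref, except that the condition also searches for NULL.
index_merge: the index merge optimization is used.
unique_subquery: IN is followed by a subquery that selects a primary key column.
index_subquery: like unique_subquery, except that the subquery after IN selects a column with a non-unique index.
range: a range query on a single table.
index: for each row from the preceding tables, the data is obtained by scanning the whole index.
ALL: for each row from the preceding tables, the data is obtained by a full table scan.

possible_keys: the indexes that the query could use.

key: the index actually used.

key_len: the length, in bytes, of the index key that is used.

rows: the estimated number of rows to scan.

Extra: additional notes describing how the query is executed.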

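The example above confirms that the full table scan on table a is what makes the query inefficient, so create an index on the year column of that table (the statements from here on use the tables sales2 and company2):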

mysql> create index ind_sales2_year on sales2(year);
Query OK, 1000 rows affected (0.03 sec)
Records: 1000 Duplicates: 0 Warnings: 0

mysql> explain select sum(moneys) from sales2 a,company2 b where a.company_id = b.id and a.year = 2006\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: ref
possible_keys: ind_sales2_year
key: ind_sales2_year
key_len: 2
ref: const
rows: 1
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: b
type: ref
possible_keys: ind_company2_id
key: ind_company2_id
key_len: 5
ref: sakila.a.company_id
rows: 1
Extra: Using where; Using index
2 rows in set (0.00 sec)

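After the index is created, the number of rows scanned in table a drops sharply (from 1000 to 1). Using an index can greatly speed up database access, and the benefit grows as the table gets larger.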
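
The article does not show how sales2 and company2 are defined. To try these plans yourself, a compatible schema might look like the sketch below. The column types are assumptions (guessed to be consistent with the key_len values in the output, not taken from the original), and actual plans and row estimates will vary with your data and MySQL version.

```sql
-- Hypothetical definitions; the column types are guesses.
CREATE TABLE company2 (
  id   INT,
  name VARCHAR(10)
);

CREATE TABLE sales2 (
  id         INT,
  company_id INT,
  year       YEAR,
  moneys     DECIMAL(10, 2)
);

CREATE INDEX ind_company2_id ON company2 (id);
CREATE INDEX ind_sales2_year ON sales2 (year);

-- Re-run the plan after loading some test rows.
EXPLAIN SELECT SUM(moneys)
FROM sales2 a, company2 b
WHERE a.company_id = b.id AND a.year = 2006;
```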

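Index issues

MySQL currently has only two index storage types, BTREE and HASH, and which ones are available depends on the storage engine: MyISAM and InnoDB support only BTREE indexes, while MEMORY/HEAP supports both HASH and BTREE indexes.

MySQL versions before 8.0.13 do not support function-based indexes (functional key parts were added in 8.0.13), but any version can index a leading prefix of a column: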

mysql> create index ind_company2_name on company2(name(4));
Query OK, 1000 rows affected (0.03 sec)
Records: 1000 Duplicates: 0 Warnings: 0
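
How long the prefix should be depends on the data. One common way to judge, sketched below, is to compare the selectivity of a few prefix lengths with that of the full column. The second statement assumes MySQL 8.0.13 or later, where functional key parts exist; the index name ind_company2_name_upper is made up for illustration.

```sql
-- Fraction of distinct values for two prefix lengths versus the full column.
-- A prefix whose figure is close to sel_full loses little selectivity.
SELECT COUNT(DISTINCT LEFT(name, 4)) / COUNT(*) AS sel_prefix_4,
       COUNT(DISTINCT LEFT(name, 8)) / COUNT(*) AS sel_prefix_8,
       COUNT(DISTINCT name) / COUNT(*)          AS sel_full
FROM company2;

-- MySQL 8.0.13+ only: index an expression (note the double parentheses).
CREATE INDEX ind_company2_name_upper ON company2 ((UPPER(name)));
```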

Using indexes

1. For a multi-column index, the index is generally used as long as the query condition includes its leftmost column. For example:

mysql> create index ind_sales2_companyid_moneys on sales2(company_id,moneys);
Query OK, 1000 rows affected (0.03 sec)
Records: 1000 Duplicates: 0 Warnings: 0

--Then query the table by company_id:
mysql> explain select * from sales2 where company_id = 2006\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sales2
type: ref
possible_keys: ind_sales2_companyid_moneys
key: ind_sales2_companyid_moneys
key_len: 5
ref: const
rows: 1
Extra: Using where
1 row in set (0.00 sec)

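--The index is used even though the where condition does not combine company_id with moneys; this is the leftmost-prefix property of indexes. If the table is queried by moneys alone, however, the index is not used:
mysql> explain select * from sales2 where moneys = 1\G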
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sales2
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra: Using where
1 row in set (0.00 sec)

2. For a query that uses like, the index can be used only if the pattern is a constant and does not begin with %. Compare the following two execution plans:

mysql> explain select * from company2 where name like '%3'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: company2
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra: Using where
1 row in set (0.00 sec)
mysql> explain select * from company2 where name like '3%'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: company2
type: range
possible_keys: ind_company2_name
key: ind_company2_name
key_len: 11
ref: NULL
rows: 103
Extra: Using where
1 row in set (0.00 sec)

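3. To search large text, use a full-text index instead of like '%…%'.

4. If a column is indexed, a column_name is null condition can use the index. In the following example, the query for rows whose name is null uses the index:

mysql> explain select * from company2 where name is null\G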
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: company2
type: ref
possible_keys: ind_company2_name
key: ind_company2_name
key_len: 11
ref: const
rows: 1
Extra: Using where
1 row in set (0.00 sec)
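
For item 3 above, a full-text search might be sketched as follows. The table articles and its columns are hypothetical and exist only for this illustration. InnoDB supports FULLTEXT indexes from MySQL 5.6 onward; earlier versions offer them on MyISAM only.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE articles (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  body TEXT,
  FULLTEXT KEY ft_articles_body (body)
);

-- Can use the full-text index, unlike body LIKE '%optimization%'.
SELECT id
FROM articles
WHERE MATCH (body) AGAINST ('optimization' IN NATURAL LANGUAGE MODE);
```

Note that full-text search matches whole words, so it is not a drop-in replacement for every substring search.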

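Cases where an index exists but is not used

1. MySQL does not use an index if it estimates that doing so would be slower than a full table scan. For example, if the values of key_part1 are evenly distributed between 1 and 100, an index is a poor choice for the following query, because the condition matches most of the table:

SELECT * FROM table_name where key_part1 > 1 and key_part1 < 90;

2. On a MEMORY/HEAP table, the index is not used unless the where condition compares the indexed column with "=". This applies to HASH indexes, the default index type for MEMORY tables, which serve only equality comparisons.

3. When conditions are separated by or, and the column in the condition before the or has an index but the column after it does not, none of the indexes involved is used. For example: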

mysql> show index from sales\G
*************************** 1. row ***************************
Table: sales
Non_unique: 1
Key_name: ind_sales_year
Seq_in_index: 1
Column_name: year
Collation: A
Cardinality: NULL
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
1 row in set (0.00 sec)
--As shown above, only the year column has an index. Now look at the following execution plan:
mysql> explain select * from sales where year = 2001 or country = 'China'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sales
type: ALL
possible_keys: ind_sales_year
key: NULL
key_len: NULL
ref: NULL
rows: 12
Extra: Using where
1 row in set (0.00 sec)

4. When the column is not the first part of the index, as in the following example:

mysql> explain select * from sales2 where moneys = 1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sales2
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra: Using where
1 row in set (0.00 sec)
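--Although moneys is part of a composite index, it is not the first column of that index, so MySQL does not use the index for this query.

5. When the like pattern begins with %.

6. If the column is a string type, always quote the constant in the where condition; otherwise MySQL does not use the index even if one exists, because comparing a string column with a number makes MySQL convert the column value of each row to a number before comparing. In the following example, the name column of company2 is a character column but the value 294 in the condition is numeric, so MySQL cannot use the index on name and falls back to a full table scan.

mysql> explain select * from company2 where name = 294\G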
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: company2
type: ALL
possible_keys: ind_company2_name
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra: Using where
1 row in set (0.00 sec)
mysql> explain select * from company2 where name = '294'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: company2
type: ref
possible_keys: ind_company2_name
key: ind_company2_name
key_len: 23
ref: const
rows: 1
Extra: Using where
1 row in set (0.00 sec)
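
For case 3 above, one possible remedy, shown here only as a sketch, is to index the column on the other side of the or as well, so that MySQL can consider an index merge. The index name ind_sales_country is made up, and whether the optimizer actually chooses index_merge depends on the data and the MySQL version.

```sql
-- Give the second OR branch its own index.
CREATE INDEX ind_sales_country ON sales (country);

-- With both columns indexed, the plan may show type: index_merge.
EXPLAIN SELECT * FROM sales WHERE year = 2001 OR country = 'China';
```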

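Checking index usage

If indexes are doing their job, Handler_read_key will be high; it counts how many times a row was read by index value. A very low value indicates that the indexes bring little benefit because they are seldom used.
A high Handler_read_rnd_next indicates inefficient queries for which indexes should be added. It counts the requests to read the next row in the data file. If many table scans are taking place, Handler_read_rnd_next is high, which usually means that the tables are not indexed properly or that the queries are not written to take advantage of the indexes. For example: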

mysql> show status like 'Handler_read%';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 5     |
| Handler_read_next     | 0     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 2055  |
+-----------------------+-------+
6 rows in set (0.00 sec)
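
To see what a single query contributes to these counters, one approach is to reset the session counters, run the query, and read the counters back. A minimal sketch, reusing the sales2 table from earlier (FLUSH STATUS needs the appropriate privilege on your server):

```sql
-- Reset the status counters of the current session.
FLUSH STATUS;

-- The query to measure.
SELECT * FROM sales2 WHERE moneys = 1;

-- The counters now reflect the work done since FLUSH STATUS.
SHOW SESSION STATUS LIKE 'Handler_read%';
```

A large Handler_read_rnd_next next to a small Handler_read_key suggests that the query was served by a table scan.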

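Two simple and practical optimization methods

1. Analyze and check tables regularly

Analyzing a table gives the system accurate statistics, so that the optimizer can generate a correct execution plan for SQL statements.

--The syntax for analyzing a table is: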
ANALYZE [LOCAL | NO_WRITE_TO_BINLOG] TABLE tbl_name [, tbl_name] ...

mysql> analyze table sales;
+--------------+---------+----------+----------+
| Table        | Op      | Msg_type | Msg_text |
+--------------+---------+----------+----------+
| sakila.sales | analyze | status   | OK       |
+--------------+---------+----------+----------+
1 row in set (0.00 sec)

--The syntax for checking a table is:
CHECK TABLE tbl_name [, tbl_name] ... [option] ...
option = {QUICK | FAST | MEDIUM | EXTENDED | CHANGED}

--CHECK TABLE checks one or more tables for errors. It works on MyISAM and InnoDB tables. For MyISAM tables, the key statistics are updated as well. For example:
mysql> check table sales;
+--------------+-------+----------+----------+
| Table        | Op    | Msg_type | Msg_text |
+--------------+-------+----------+----------+
| sakila.sales | check | status   | OK       |
+--------------+-------+----------+----------+
1 row in set (0.00 sec)

CHECK TABLE can also check a view for errors, such as a table referenced in the view definition that no longer exists. For example:

--1) First, create a view
mysql> create view sales_view3 as select * from sales3;
Query OK, 0 rows affected (0.00 sec)

--2) Then CHECK the view; no problem is found.
mysql> check table sales_view3;
+--------------------+-------+----------+----------+
| Table              | Op    | Msg_type | Msg_text |
+--------------------+-------+----------+----------+
| sakila.sales_view3 | check | status   | OK       |
+--------------------+-------+----------+----------+
1 row in set (0.00 sec)

--3) Now drop the table that the view depends on
mysql> drop table sales3;
Query OK, 0 rows affected (0.00 sec)

--4) CHECK the view again; this time an error is reported
mysql> check table sales_view3\G
*************************** 1. row ***************************
Table: sakila.sales_view3
Op: check
Msg_type: error
Msg_text: View 'sakila.sales_view3' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them
1 row in set (0.00 sec)

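2. Optimize tables regularly

The syntax for optimizing a table is:

OPTIMIZE [LOCAL | NO_WRITE_TO_BINLOG] TABLE tbl_name [, tbl_name] ...

If you have deleted a large part of a table, or made many changes to a table with variable-length rows (a table with VARCHAR, BLOB, or TEXT columns), run OPTIMIZE TABLE. The command defragments the table and reclaims the space wasted by deletes and updates. OPTIMIZE TABLE works only on MyISAM, BDB, and InnoDB tables (the BDB engine has since been removed from MySQL).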

mysql> optimize table sales;
+--------------+----------+----------+----------+
| Table        | Op       | Msg_type | Msg_text |
+--------------+----------+----------+----------+
| sakila.sales | optimize | status   | OK       |
+--------------+----------+----------+----------+
1 row in set (0.00 sec)
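
One way to judge whether a table is worth optimizing is to look at how much allocated but unused space it reports. The query below is a sketch using information_schema.TABLES; the schema name sakila is taken from the output above, and DATA_FREE is only an approximation whose meaning varies by storage engine.

```sql
-- Size and reclaimable space, in bytes, for one table.
SELECT TABLE_NAME, ENGINE, DATA_LENGTH, INDEX_LENGTH, DATA_FREE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'sakila'
  AND TABLE_NAME = 'sales';
```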

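Optimizing common SQL statements

Optimizing GROUP BY

If a query contains GROUP BY and you want to avoid the cost of sorting the result, add ORDER BY NULL to suppress the sort. This matters in versions before MySQL 8.0, where GROUP BY sorts its result implicitly. For example:

mysql> explain select id,sum(moneys) from sales2 group by id\G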
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sales2
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra: Using temporary; Using filesort
1 row in set (0.00 sec)
mysql> explain select id,sum(moneys) from sales2 group by id order by null\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sales2
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra: Using temporary
1 row in set (0.00 sec)

In the example above, the first statement requires a filesort, while the second, thanks to ORDER BY NULL, does not; a filesort is often very time-consuming.

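Optimizing ORDER BY

In some cases MySQL can use an index to satisfy an ORDER BY clause without an extra sort. This requires that the WHERE condition and the ORDER BY use the same index, that the ORDER BY columns follow the index order, and that the ORDER BY columns are all ascending or all descending.
For example, the following statements can use an index:

SELECT * FROM t1 ORDER BY key_part1,key_part2,... ;
SELECT * FROM t1 WHERE key_part1=1 ORDER BY key_part1 DESC, key_part2 DESC;
SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 DESC;

The index is not used in the following cases:

SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC;
--the ORDER BY columns mix ASC and DESC
SELECT * FROM t1 WHERE key2=constant ORDER BY key1;
--the key used to fetch the rows differs from the one used in ORDER BY
SELECT * FROM t1 ORDER BY key1, key2;
--ORDER BY is applied to different keys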
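
As a sketch of the rules above, assume a hypothetical table t1 with a composite index on (key_part1, key_part2); the table definition and index names are made up. Running EXPLAIN and checking whether Extra contains Using filesort shows whether the index satisfied the sort. The last statement assumes MySQL 8.0 or later, where descending key parts in an index definition are honored, so a mixed ASC/DESC order can also be served by a matching index.

```sql
-- Hypothetical table for illustration.
CREATE TABLE t1 (
  id        INT PRIMARY KEY,
  key_part1 INT,
  key_part2 INT
);
CREATE INDEX idx_t1_kp ON t1 (key_part1, key_part2);

-- Same direction on both columns, in index order: no filesort expected.
EXPLAIN SELECT key_part1, key_part2 FROM t1
ORDER BY key_part1 DESC, key_part2 DESC;

-- Mixed directions: expect "Using filesort" with the index above.
EXPLAIN SELECT key_part1, key_part2 FROM t1
ORDER BY key_part1 DESC, key_part2 ASC;

-- MySQL 8.0+: an index declared with matching directions.
CREATE INDEX idx_t1_kp_mixed ON t1 (key_part1 DESC, key_part2 ASC);
```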

Optimizing nested queries

In some cases a subquery can be replaced by a more efficient join (JOIN).

mysql> explain select * from sales2 where company_id not in ( select id from company2 )\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: sales2
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: company2
type: index_subquery
possible_keys: ind_company2_id
key: ind_company2_id
key_len: 5
ref: func
rows: 2
Extra: Using index
2 rows in set (0.00 sec)

Doing the same work with a join (JOIN) is much faster, especially when company2 has an index on id. The query is as follows:

mysql> explain select * from sales2 left join company2 on sales2.company_id = company2.id where sales2.company_id is null\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sales2
type: ref
possible_keys: ind_sales2_companyid_moneys
key: ind_sales2_companyid_moneys
key_len: 5
ref: const
rows: 1
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: company2
type: ref
possible_keys: ind_company2_id
key: ind_company2_id
key_len: 5
ref: sakila.sales2.company_id
rows: 1
Extra:
2 rows in set (0.00 sec)

A join (JOIN) is more efficient because MySQL does not need to create a temporary table in memory to carry out a query that logically takes two steps.
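
Note that the join shown above filters on sales2.company_id is null, which returns rows whose company_id is NULL rather than rows that have no matching company. A sketch of the usual anti-join form, which filters on the right-hand table instead, together with a NOT EXISTS variant:

```sql
-- Rows of sales2 that have no matching row in company2.
SELECT sales2.*
FROM sales2
LEFT JOIN company2 ON sales2.company_id = company2.id
WHERE company2.id IS NULL;

-- The same result expressed with NOT EXISTS.
SELECT *
FROM sales2
WHERE NOT EXISTS (
  SELECT 1 FROM company2 WHERE company2.id = sales2.company_id
);
```

Both forms can differ from NOT IN when either column contains NULL, so check how NULL values should be treated before swapping one for the other.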

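How MySQL optimizes OR conditions

For a query clause containing OR to use indexes, every condition column joined by OR must have a usable index; if one is missing, consider adding it.
For example, running show index on sales2 shows that the table has three indexes: one standalone index on each of id and year, and one composite index on company_id and moneys.

--Then run an OR across the two standalone indexes:
mysql> explain select * from sales2 where id = 2 or year = 1998\G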
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sales2
type: index_merge
possible_keys: ind_sales2_id,ind_sales2_year
key: ind_sales2_id,ind_sales2_year
key_len: 5,2
ref: NULL
rows: 2
Extra: Using union(ind_sales2_id,ind_sales2_year); Using where
1 row in set (0.00 sec)

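The query uses the indexes correctly, and the plan shows that when MySQL handles a query with an OR clause, it actually looks up each OR column separately and takes the UNION of the results.
However, an OR across company_id and moneys, the two columns of the composite index, cannot use the index:

mysql> explain select * from sales2 where company_id = 3 or moneys = 100\G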
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sales2
type: ALL
possible_keys: ind_sales2_companyid_moneys
key: NULL
key_len: NULL
ref: NULL
rows: 1000
Extra: Using where
1 row in set (0.00 sec)
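
One workaround, shown here as a sketch rather than a guaranteed improvement, is to give moneys an index of its own and split the OR into two queries combined with UNION. The index name ind_sales2_moneys is hypothetical.

```sql
CREATE INDEX ind_sales2_moneys ON sales2 (moneys);

-- Each branch can use its own index; UNION removes duplicate rows.
SELECT * FROM sales2 WHERE company_id = 3
UNION
SELECT * FROM sales2 WHERE moneys = 100;
```

Because UNION removes duplicates, the result differs from the OR query if the table contains rows that are identical in every column.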

    Original author: 罗志贇
    Original URL: https://www.jianshu.com/p/443cc72faeb5