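I have a question about the execution time of a query, and it has me puzzled.
I know a few ways to work around the problem and get better, acceptable execution times, but they still don't explain why the problem happens.
Sample tables
We have two tables related by a foreign key.
Table1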
| Id | IdTable2 |
|:--:|:--------:|
| 1 | 4 |
| 2 | 7 |
| 3 | 8 |
| 4 | 6 |
| 5 | 4 |
| 6 | 1 |
| 7 | 1 |
| 8 | 6 |
| 9 | 7 |
| 10 | 1 |
Table2
| Id | ValueField |
|:--:|:----------:|
| 1 | 0 |
| 2 | 0 |
| 3 | 0 |
| 4 | 1 |
| 5 | 0 |
| 6 | 1 |
| 7 | 0 |
Query
SELECT * FROM Table1 WHERE IdTable2 IN (SELECT Id FROM Table2 WHERE ValueField = ?);
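Here ? can be 0 or 1.
Real data counts
The tables above are just a simplified example; the actual row counts are:
> Table1: 60420 rows
> Table2: 62 rows
> Table2 with ValueField 0: 51 rows
> Table2 with ValueField 1: 11 rows
> Table1 with IdTable2 pointing at ValueField 0: 599 rows
> Table1 with IdTable2 pointing at ValueField 1: 59821 rows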
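
The last two counts could be reproduced with something like the following. This is only a sketch against the simplified sample schema (Table1, Table2); the aliases and the Table1Rows column name are made up for illustration.

```sql
-- Count the Table1 rows per ValueField of the Table2 row they reference.
SELECT t2.ValueField, COUNT(*) AS Table1Rows
FROM Table1 AS t1
INNER JOIN Table2 AS t2 ON t2.Id = t1.IdTable2
GROUP BY t2.ValueField
ORDER BY t2.ValueField;
```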
The problem
SELECT * FROM Table1 WHERE IdTable2 IN (SELECT Id FROM Table2 WHERE ValueField = 0);
-- Execution time LOW/INSTANT
SELECT * FROM Table1 WHERE IdTable2 IN (SELECT Id FROM Table2 WHERE ValueField = 1);
-- Execution time HIGH
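Well, at first I thought the subquery was the culprit, but if the subquery were the problem, the two values would not run in such wildly different times. So I thought the amount of data retrieved might be the problem, and I tried this: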
SELECT * FROM Table1 WHERE IdTable2 IN (1,2,3,5,7); -- Equivalent of ValueField 0
-- Execution time LOW/INSTANT
SELECT * FROM Table1 WHERE IdTable2 IN (4,6); -- Equivalent of ValueField 1
-- Execution time LOW/INSTANT
Hmm... the amount of data retrieved is not it either. Let's try something else:
SELECT * FROM Table1 WHERE IdTable2 IN (SELECT Id FROM Table2 WHERE ValueField = 0);
-- Execution time LOW/INSTANT
SELECT * FROM Table1 WHERE IdTable2 NOT IN (SELECT Id FROM Table2 WHERE ValueField = 0);
-- Execution time LOW/INSTANT
What happens if I reverse it?
SELECT * FROM Table1 WHERE IdTable2 NOT IN (SELECT Id FROM Table2 WHERE ValueField = 1);
-- Execution time LOW/INSTANT
SELECT * FROM Table1 WHERE IdTable2 IN (SELECT Id FROM Table2 WHERE ValueField = 0);
-- Execution time LOW/INSTANT
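Hmm... this pretty much tells me the problem is neither the subquery nor the data. So why does comparing against ValueField = 1 with IN cause the problem, while none of the other variants reproduces the HIGH execution time?
Execution plans
For SQL IN ValueField 1: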
SELECT * FROM Incidencias WHERE EstadoWorkflow in (SELECT IdEstadoWorkflow FROM EstadosWorkflows WHERE Final = 1);
http://s000.tinyupload.com/index.php?file_id=19036217708532467879
For SQL IN ValueField 0:
SELECT * FROM Incidencias WHERE EstadoWorkflow in (SELECT IdEstadoWorkflow FROM EstadosWorkflows WHERE Final = 0);
http://s000.tinyupload.com/index.php?file_id=49593927895920014301
For SQL NOT IN ValueField 0:
SELECT * FROM Incidencias WHERE EstadoWorkflow not in (SELECT IdEstadoWorkflow FROM EstadosWorkflows WHERE Final = 0);
http://s000.tinyupload.com/index.php?file_id=03901091628843565847
For SQL NOT IN ValueField 1:
SELECT * FROM Incidencias WHERE EstadoWorkflow not in (SELECT IdEstadoWorkflow FROM EstadosWorkflows WHERE Final = 1);
http://s000.tinyupload.com/index.php?file_id=69996775965382534356
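The queries are the same as the ones in my example, but with different names. Here is the dictionary mapping the sample queries to the real ones:
> Table1: Incidencias
> Table2: EstadosWorkflows
> IdTable2: EstadoWorkflow
> Table2.Id: IdEstadoWorkflow
> ValueField: Final

And the reverse, for easier reading:
> Incidencias: Table1
> EstadosWorkflows: Table2
> EstadoWorkflow: IdTable2
> IdEstadoWorkflow: Table2.Id
> Final: ValueField

Real production queries
These queries show the same problem as the ones in the query plans above, but they contain additional expensive operations (such as a large EXISTS and several joins), which makes the problem worse.
I really hope I have not misled you with the simplified example.
Query IN with value 0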
SELECT distinct top 15 this_.IdIncidencia as y0_, this_.Fecha as y1_
FROM Incidencias this_ inner join Usuarios usuario1_ on this_.Usuario=usuario1_.IdUsuario inner join Usuarios_Perfiles perfiles5_ on usuario1_.IdUsuario=perfiles5_.Usuario and (perfiles5_.perfil in (select perfiles.idperfil from perfiles where perfiles.borrado = 0)) inner join Perfiles prf2_ on perfiles5_.Perfil=prf2_.IdPerfil
WHERE
this_.Instancia = 4 and
this_.EstadoWorkflow in (SELECT this_0_.IdEstadoWorkflow as y0_ FROM EstadosWorkflows this_0_ WHERE this_0_.Final = 0) and
exists (SELECT this_0_.IdPerfilPermiso as y0_ FROM Perfiles_Permisos this_0_ inner join Permisos prm1_ on this_0_.Permiso=prm1_.IdPermiso WHERE this_0_.IdPerfilPermiso in (206558, 206559, 209393, 209394) and (this_0_.PerfilAutorizado = prf2_.IdPerfil and this_0_.TipologiaAutorizada = this_.Tipologia and prm1_.Controlador = 'Incidencias' and prm1_.Accion = 'Index'))
ORDER BY this_.Fecha desc
Execution time: 266ms.
Execution plan: http://s000.tinyupload.com/index.php?file_id=36115325682943356233
Query IN with value 1
SELECT distinct top 15 this_.IdIncidencia as y0_, this_.Fecha as y1_
FROM Incidencias this_ inner join Usuarios usuario1_ on this_.Usuario=usuario1_.IdUsuario inner join Usuarios_Perfiles perfiles5_ on usuario1_.IdUsuario=perfiles5_.Usuario and (perfiles5_.perfil in (select perfiles.idperfil from perfiles where perfiles.borrado = 0)) inner join Perfiles prf2_ on perfiles5_.Perfil=prf2_.IdPerfil
WHERE
this_.Instancia = 4 and
this_.EstadoWorkflow in (SELECT this_0_.IdEstadoWorkflow as y0_ FROM EstadosWorkflows this_0_ WHERE this_0_.Final = 1) and
exists (SELECT this_0_.IdPerfilPermiso as y0_ FROM Perfiles_Permisos this_0_ inner join Permisos prm1_ on this_0_.Permiso=prm1_.IdPermiso WHERE this_0_.IdPerfilPermiso in (206558, 206559, 209393, 209394) and (this_0_.PerfilAutorizado = prf2_.IdPerfil and this_0_.TipologiaAutorizada = this_.Tipologia and prm1_.Controlador = 'Incidencias' and prm1_.Accion = 'Index'))
ORDER BY this_.Fecha desc
Execution time: 28506ms.
Execution plan: http://s000.tinyupload.com/index.php?file_id=72827687005228029776
Query NOT IN with value 0
SELECT distinct top 15 this_.IdIncidencia as y0_, this_.Fecha as y1_
FROM Incidencias this_ inner join Usuarios usuario1_ on this_.Usuario=usuario1_.IdUsuario inner join Usuarios_Perfiles perfiles5_ on usuario1_.IdUsuario=perfiles5_.Usuario and (perfiles5_.perfil in (select perfiles.idperfil from perfiles where perfiles.borrado = 0)) inner join Perfiles prf2_ on perfiles5_.Perfil=prf2_.IdPerfil
WHERE
this_.Instancia = 4 and
this_.EstadoWorkflow not in (SELECT this_0_.IdEstadoWorkflow as y0_ FROM EstadosWorkflows this_0_ WHERE this_0_.Final = 0) and
exists (SELECT this_0_.IdPerfilPermiso as y0_ FROM Perfiles_Permisos this_0_ inner join Permisos prm1_ on this_0_.Permiso=prm1_.IdPermiso WHERE this_0_.IdPerfilPermiso in (206558, 206559, 209393, 209394) and (this_0_.PerfilAutorizado = prf2_.IdPerfil and this_0_.TipologiaAutorizada = this_.Tipologia and prm1_.Controlador = 'Incidencias' and prm1_.Accion = 'Index'))
ORDER BY this_.Fecha desc
Execution time: 498ms.
Execution plan: http://s000.tinyupload.com/index.php?file_id=35554889075362686964
Query NOT IN with value 1
SELECT distinct top 15 this_.IdIncidencia as y0_, this_.Fecha as y1_
FROM Incidencias this_ inner join Usuarios usuario1_ on this_.Usuario=usuario1_.IdUsuario inner join Usuarios_Perfiles perfiles5_ on usuario1_.IdUsuario=perfiles5_.Usuario and (perfiles5_.perfil in (select perfiles.idperfil from perfiles where perfiles.borrado = 0)) inner join Perfiles prf2_ on perfiles5_.Perfil=prf2_.IdPerfil
WHERE
this_.Instancia = 4 and
this_.EstadoWorkflow not in (SELECT this_0_.IdEstadoWorkflow as y0_ FROM EstadosWorkflows this_0_ WHERE this_0_.Final = 1) and
exists (SELECT this_0_.IdPerfilPermiso as y0_ FROM Perfiles_Permisos this_0_ inner join Permisos prm1_ on this_0_.Permiso=prm1_.IdPermiso WHERE this_0_.IdPerfilPermiso in (206558, 206559, 209393, 209394) and (this_0_.PerfilAutorizado = prf2_.IdPerfil and this_0_.TipologiaAutorizada = this_.Tipologia and prm1_.Controlador = 'Incidencias' and prm1_.Accion = 'Index'))
ORDER BY this_.Fecha desc
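Execution time: 386ms.
Execution plan: http://s000.tinyupload.com/index.php?file_id=11500314236594795220
Best answer
The cause of the problem is that, at optimization time, SQL Server cannot know the exact values the subquery in the IN clause will return, so it cannot use the statistics for them.
When you have the exact values in the IN clause, they can be compared against the statistics, and SQL Server can most likely estimate quite accurately how many rows there will be and then choose the best plan to execute.
I have not tried this myself, but you could try creating filtered statistics on the id, one for ValueField 0 and one for ValueField 1; that might improve the situation.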
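
A minimal sketch of what such filtered statistics could look like, using the real table names from the question. The dbo schema and the statistics names are assumptions, and, as said above, whether this actually improves the estimate is untested.

```sql
-- Assumption: the table lives in the dbo schema; the statistics names are made up.
-- One filtered statistic per value of Final, built on the id column.
CREATE STATISTICS st_EstadosWorkflows_Final0
    ON dbo.EstadosWorkflows (IdEstadoWorkflow)
    WHERE Final = 0
    WITH FULLSCAN;

CREATE STATISTICS st_EstadosWorkflows_Final1
    ON dbo.EstadosWorkflows (IdEstadoWorkflow)
    WHERE Final = 1
    WITH FULLSCAN;
```

After creating them, re-run the slow query and compare the estimated and actual row counts in the plan to see whether anything changed.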
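Update
From the latest pictures it is clear that the estimate is off: the estimated row count is 1, but after the nested loops it is actually 59851:
And this wrong estimate seems to lead to a huge number of table scans, because the scan was expected to be executed only once:
Since this is a table scan and not a clustered index scan, it looks like the table has no clustered index, and no other index that could be used either. Can you do something about that? I don't know the data volume, but an index on borrado with idperfil as an included or a normal key column might help. The same thing happens in the value-0 plan, but since the row count there is only 605, 605 table scans do not take that much time; when you do almost 100 times as many, it starts to add up.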
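
As a rough sketch of the index suggested above: the dbo schema and the index name are assumptions, and which of the two variants (included column or second key column) works better would have to be measured.

```sql
-- Assumption: dbo schema; the index name is made up for illustration.
-- Variant 1: borrado as the key, idperfil as an included column.
CREATE NONCLUSTERED INDEX IX_Perfiles_Borrado
    ON dbo.Perfiles (Borrado)
    INCLUDE (IdPerfil);

-- Variant 2 (alternative, not in addition): idperfil as a normal key column.
-- CREATE NONCLUSTERED INDEX IX_Perfiles_Borrado_IdPerfil
--     ON dbo.Perfiles (Borrado, IdPerfil);
```

Either variant lets the subquery on perfiles be answered from the index alone instead of scanning the whole table on every execution.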
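Looking at the NOT IN plans, the structure is completely different, most likely because the estimated row count is much closer to the actual one, so SQL Server uses this kind of plan:
So another solution could be to create a temporary table from Usuarios_Perfiles (with the perfiles restriction); that might help because it has only 1179 rows.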
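
The temporary-table idea could be sketched as follows. The temp table name #PerfilesActivos and the shortened aliases inside the subqueries are made up; the rest is the slow production query from the question with the Usuarios_Perfiles join replaced. Treat it as an illustration, not a tested fix.

```sql
-- Materialize Usuarios_Perfiles restricted to non-deleted perfiles (about 1179 rows).
SELECT up.Usuario, up.Perfil
INTO #PerfilesActivos
FROM Usuarios_Perfiles AS up
WHERE up.Perfil IN (SELECT p.IdPerfil FROM Perfiles AS p WHERE p.Borrado = 0);

-- Same query as before, joining the temp table instead of Usuarios_Perfiles.
SELECT DISTINCT TOP 15 this_.IdIncidencia AS y0_, this_.Fecha AS y1_
FROM Incidencias this_
INNER JOIN Usuarios usuario1_ ON this_.Usuario = usuario1_.IdUsuario
INNER JOIN #PerfilesActivos perfiles5_ ON usuario1_.IdUsuario = perfiles5_.Usuario
INNER JOIN Perfiles prf2_ ON perfiles5_.Perfil = prf2_.IdPerfil
WHERE this_.Instancia = 4
  AND this_.EstadoWorkflow IN (SELECT e.IdEstadoWorkflow
                               FROM EstadosWorkflows e
                               WHERE e.Final = 1)
  AND EXISTS (SELECT 1
              FROM Perfiles_Permisos pp
              INNER JOIN Permisos prm1_ ON pp.Permiso = prm1_.IdPermiso
              WHERE pp.IdPerfilPermiso IN (206558, 206559, 209393, 209394)
                AND pp.PerfilAutorizado = prf2_.IdPerfil
                AND pp.TipologiaAutorizada = this_.Tipologia
                AND prm1_.Controlador = 'Incidencias'
                AND prm1_.Accion = 'Index')
ORDER BY this_.Fecha DESC;

DROP TABLE #PerfilesActivos;
```

Because the temp table has its own statistics, the optimizer sees the real row count for that join input instead of guessing it.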
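Without the STATISTICS IO output it is not 100% certain where the time is spent, but it looks very much like it is caused by the table scans.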
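
For reference, the STATISTICS IO output can be collected like this. The query is the slow one from the question; SET STATISTICS TIME is an addition of mine that also reports CPU and elapsed time.

```sql
-- Report logical/physical reads per table, plus CPU and elapsed time.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT *
FROM Incidencias
WHERE EstadoWorkflow IN (SELECT IdEstadoWorkflow
                         FROM EstadosWorkflows
                         WHERE Final = 1);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The output appears in the Messages tab; a very high scan count or logical read count on a single table would confirm that the repeated table scans are where the time goes.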