Query optimization: SQL "WHERE" clause for subqueries

March 17, 2023

Suppose I have the following hypothetical data structure:

create table "country"
(
  country_id integer,  
  country_name varchar(50),
  continent varchar(50),
  constraint country_pkey primary key (country_id)
);

create table "person"
(
  person_id integer,
  person_name varchar(100),
  country_id integer,
  constraint person_pkey primary key (person_id)
);

create table "event"
(
  event_id integer,
  event_desc varchar(100),
  country_id integer,
  constraint event_pkey primary key (event_id)
);

I want to query the number of people and the number of events for each country. I decided to use subqueries.

select c.country_name, sum(sub1.person_count) as person_count, sum(sub2.event_count) as event_count
from
  "country" c
  left join (select country_id, count(*) as person_count from "person" group by country_id) sub1
    on (c.country_id=sub1.country_id)
  left join (select country_id, count(*) as event_count from "event" group by country_id) sub2
    on (c.country_id=sub2.country_id)
group by c.country_name

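I know you can achieve this with select statements in the field list, but the advantage of using subqueries is that I have more flexibility to change the SQL so that it summarizes by another field. Say I change the query to display the counts by continent: it would be as simple as replacing the field "c.country_name" with "c.continent".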
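For illustration, the by-continent variant described above might look like the sketch below. It only swaps the grouping column and leaves the joins untouched; because each subquery returns at most one row per country_id, the joins do not multiply rows before the sums are taken.

```sql
-- Same query as above, summarized by continent instead of by country.
select c.continent,
  sum(sub1.person_count) as person_count,  -- NULL if no country on the continent has people
  sum(sub2.event_count) as event_count     -- NULL if no country on the continent has events
from
  "country" c
  left join (select country_id, count(*) as person_count from "person" group by country_id) sub1
    on (c.country_id=sub1.country_id)
  left join (select country_id, count(*) as event_count from "event" group by country_id) sub2
    on (c.country_id=sub2.country_id)
group by c.continent;
```

If zeros are preferred over NULLs, the sums can be wrapped in coalesce(..., 0).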

My question is about filtering. Suppose we add a where clause like this:

select c.country_name, 
  sum(sub1.person_count) as person_count, 
  sum(sub2.event_count) as event_count
from
  "country" c
  left join (select country_id, count(*) as person_count from "person" group by country_id) sub1
    on (c.country_id=sub1.country_id)
  left join (select country_id, count(*) as event_count from "event" group by country_id) sub2
    on (c.country_id=sub2.country_id)
where c.country_name='UNITED STATES'
group by c.country_name

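The subqueries still seem to compute the counts for all countries. Assume the person and event tables are huge, and that I already have indexes on country_id in all tables. It is really slow. Shouldn't the database execute the subqueries only for the country that was filtered? Do I have to repeat the country filter in each subquery (which is very tedious and makes the code hard to modify)? I'm using PostgreSQL 8.3 and 9.0, by the way, but I guess the same applies to other databases.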

Best answer

Shouldn’t the database only execute the subqueries for the country
that was filtered?

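No. The first step in a query like yours appears to be building a working table from all the table constructors in the FROM clause. The WHERE clause is evaluated after that.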
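One way to check what your own server does with the query is to prefix it with EXPLAIN ANALYZE, which is available in both PostgreSQL 8.3 and 9.0. This is only a sketch: the plan you get depends on the version, the table statistics, and the indexes present, so no output is shown here. Keep in mind that EXPLAIN ANALYZE actually runs the query.

```sql
-- Show the chosen plan together with actual row counts and timings.
-- If the aggregate nodes over "person" and "event" report rows for every
-- country_id, the counts were computed before the country filter took effect.
explain analyze
select c.country_name,
  sum(sub1.person_count) as person_count,
  sum(sub2.event_count) as event_count
from
  "country" c
  left join (select country_id, count(*) as person_count from "person" group by country_id) sub1
    on (c.country_id=sub1.country_id)
  left join (select country_id, count(*) as event_count from "event" group by country_id) sub2
    on (c.country_id=sub2.country_id)
where c.country_name='UNITED STATES'
group by c.country_name;
```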

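Imagine how you would do this if sub1 and sub2 were both base tables instead of subselects. Each would have two columns and one row per country_id. If you wanted to join all the rows, you'd write it like this.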

from
  "country" c
  left join sub1 on (c.country_id=sub1.country_id)
  left join sub2 on (c.country_id=sub2.country_id)

But if you wanted to join on just one row, you'd write something equivalent to this.

from
  "country" c
  left join (select * from sub1 where country_id = ?) sub1
    on (c.country_id=sub1.country_id)
  left join (select * from sub2 where country_id = ?) sub2
    on (c.country_id=sub2.country_id)
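Applied to the tables from the question, the same idea could look like the sketch below. It assumes the filter is repeated inside each subquery as a lookup against "country", so that only the matching country_id values are counted; whether the planner then uses the country_id indexes depends on your data and statistics.

```sql
-- Restrict each aggregate to the filtered country before counting.
select c.country_name,
  sum(sub1.person_count) as person_count,
  sum(sub2.event_count) as event_count
from
  "country" c
  left join (
    select country_id, count(*) as person_count
    from "person"
    where country_id in (select country_id from "country" where country_name='UNITED STATES')
    group by country_id
  ) sub1
    on (c.country_id=sub1.country_id)
  left join (
    select country_id, count(*) as event_count
    from "event"
    where country_id in (select country_id from "country" where country_name='UNITED STATES')
    group by country_id
  ) sub2
    on (c.country_id=sub2.country_id)
where c.country_name='UNITED STATES'
group by c.country_name;
```

The outer where clause is still needed; without it, every other country would appear in the result with NULL counts.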

Joe Celko, who helped develop the early SQL standards, has often written on Usenet about how SQL's order of evaluation appears.