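A query over a bounded range can be answered with a fixed number of joins, but when the depth is not known in advance, recursion is required.
Tree- and graph-structured data often need recursion.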
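To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The three-row `ParentOf` table and the helper name `grandparents_of` are assumptions made up for illustration. Grandparents sit at a fixed depth of two, so a single self-join is enough; ancestors at arbitrary depth cannot be expressed with any fixed number of joins.

```python
import sqlite3

def grandparents_of(person):
    """Fixed depth (two levels): one self-join is sufficient."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE ParentOf(parent TEXT, child TEXT)")
        # Hypothetical sample rows: Alice -> Carol -> Dave -> Mary
        conn.executemany(
            "INSERT INTO ParentOf VALUES (?, ?)",
            [("Alice", "Carol"), ("Carol", "Dave"), ("Dave", "Mary")],
        )
        rows = conn.execute(
            """
            SELECT p1.parent
            FROM ParentOf p1 JOIN ParentOf p2 ON p1.child = p2.parent
            WHERE p2.child = ?
            """,
            (person,),
        ).fetchall()
        return sorted(r[0] for r in rows)
    finally:
        conn.close()
```

With these rows, `grandparents_of("Mary")` returns `["Carol"]`. Reaching Alice would take a second self-join, and every additional level would take one more.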
Stanford CS145 PS1
SQL recursion syntax
With Recursive
R As (base query
Union
recursive query )
<query involving R (and other tables)>
Computing a factorial in SQL
WITH RECURSIVE
factorials(n,x) AS (
SELECT 1, 1
UNION
SELECT n+1, (n+1)*x FROM factorials WHERE n < 5)
SELECT x FROM factorials WHERE n = 5;
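One way to try the query without setting up a database server is Python's built-in sqlite3 module, which supports `WITH RECURSIVE`. The sketch below is an assumption about how you might run it, not part of the original problem set. It selects every row of `factorials` instead of only `n = 5`, so the intermediate rows are visible; the names `FACTORIAL_SQL` and `factorial_rows` are made up for illustration.

```python
import sqlite3

# Same CTE as above; the final SELECT returns all rows instead of only n = 5.
FACTORIAL_SQL = """
WITH RECURSIVE
  factorials(n, x) AS (
    SELECT 1, 1
    UNION
    SELECT n + 1, (n + 1) * x FROM factorials WHERE n < 5)
SELECT n, x FROM factorials ORDER BY n
"""

def factorial_rows():
    """Run the recursive query against an empty in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(FACTORIAL_SQL).fetchall()
    finally:
        conn.close()
```

Calling `factorial_rows()` returns `[(1, 1), (2, 2), (3, 6), (4, 24), (5, 120)]`.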
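The SQL above executes as follows:
- Evaluate the base case first and insert (1, 1) into the factorials table
- Take (1, 1) from factorials, compute (1+1, (1+1)*1) = (2, 2), and insert it into factorials
- Repeat. Because the query uses UNION, duplicate rows are discarded automatically, so each iteration only needs to process the rows inserted most recently
- Stop when n < 5 no longer holds, that is, when the recursive query produces no new rows
- Finally, read the value of x where n is 5 from factorials (at this point the table contains (1, 1), (2, 2), (3, 6), (4, 24), (5, 120))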
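The steps above can be modelled in a few lines of Python. This is only a sketch of the evaluation strategy (a result set plus a "working" set holding the rows added in the previous round), not how any particular database engine is implemented; the function name `evaluate_factorials` and the `limit` parameter are assumptions for illustration.

```python
def evaluate_factorials(limit=5):
    """Model the iterative evaluation of the factorials CTE."""
    result = {(1, 1)}    # rows produced by the base query
    working = {(1, 1)}   # rows added in the previous iteration
    while working:
        # Recursive query: applied only to the most recently added rows.
        produced = {(n + 1, (n + 1) * x) for (n, x) in working if n < limit}
        # UNION semantics: drop rows that are already in the result.
        working = produced - result
        result |= working
    return sorted(result)
```

`evaluate_factorials()` returns `[(1, 1), (2, 2), (3, 6), (4, 24), (5, 120)]`. The loop ends as soon as one round adds no new rows, which here happens after (5, 120) fails the `n < limit` filter.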
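This shows that recursion in SQL differs from recursion in other languages.
In other languages, recursion is usually written top-down: the function calls itself on smaller inputs until it reaches the base case.
SQL recursion instead starts from the base case and keeps adding rows through UNION.
To me this feels more like dynamic programming. Choosing UNION over UNION ALL is similar to recording solved subproblems in dynamic programming.
Compare it with the following Python code for computing a factorial, which fills a table bottom-up. The two are indeed very similar.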
def factorial(n):
    # memo[i] holds i!; memo[0] = 1 is the base case
    memo = [1] * (n + 1)
    i = 1
    while i <= n:
        memo[i] = memo[i - 1] * i  # build on the previously stored result
        i += 1
    return memo[n]
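The difference between UNION and UNION ALL matters most when the data contains a cycle. The sketch below models both in plain Python on a hypothetical two-node graph (a -> b -> a). The function name `reachable` and the `max_rounds` safety cap are assumptions for illustration, not something a database provides in this form.

```python
def reachable(edges, start, dedupe=True, max_rounds=10):
    """Return (rows, rounds). dedupe=True models UNION, False models UNION ALL."""
    result = [start]
    working = [start]   # rows added in the previous round
    rounds = 0
    while working and rounds < max_rounds:
        # Recursive step: follow one edge from every row added last round.
        produced = [dst for node in working for (src, dst) in edges if src == node]
        if dedupe:
            # UNION: keep only rows that have never been produced before.
            seen = set(result)
            fresh = []
            for node in produced:
                if node not in seen:
                    seen.add(node)
                    fresh.append(node)
            produced = fresh
        result.extend(produced)
        working = produced
        rounds += 1
    return result, rounds
```

With `edges = [("a", "b"), ("b", "a")]`, `reachable(edges, "a")` returns `(["a", "b"], 2)`, while `reachable(edges, "a", dedupe=False)` only stops because of the `max_rounds` cap and returns 11 rows. Discarding already-seen rows is what lets the UNION version reach a fixed point.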
Finally, here are some examples of SQL recursion.
/**************************************************************
EXAMPLE 1: Ancestors
Find all of Mary's ancestors
**************************************************************/
create table ParentOf(parent text, child text);
insert into ParentOf values ('Alice', 'Carol');
insert into ParentOf values ('Bob', 'Carol');
insert into ParentOf values ('Carol', 'Dave');
insert into ParentOf values ('Carol', 'George');
insert into ParentOf values ('Dave', 'Mary');
insert into ParentOf values ('Eve', 'Mary');
insert into ParentOf values ('Mary', 'Frank');
with recursive
Ancestor(a,d) as (select parent as a, child as d from ParentOf
union
select Ancestor.a, ParentOf.child as d
from Ancestor, ParentOf
where Ancestor.d = ParentOf.parent)
select a from Ancestor where d = 'Mary';
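Example 1 can be checked with Python's sqlite3 module. The harness below is a sketch: the names `PARENT_ROWS`, `ANCESTOR_SQL`, and `ancestors_of` are assumptions, and the query differs from the one above only in taking the descendant as a parameter and adding `ORDER BY` for a stable result.

```python
import sqlite3

# Same rows as the ParentOf inserts in Example 1.
PARENT_ROWS = [
    ("Alice", "Carol"), ("Bob", "Carol"), ("Carol", "Dave"),
    ("Carol", "George"), ("Dave", "Mary"), ("Eve", "Mary"),
    ("Mary", "Frank"),
]

ANCESTOR_SQL = """
WITH RECURSIVE
  Ancestor(a, d) AS (
    SELECT parent AS a, child AS d FROM ParentOf
    UNION
    SELECT Ancestor.a, ParentOf.child AS d
    FROM Ancestor, ParentOf
    WHERE Ancestor.d = ParentOf.parent)
SELECT a FROM Ancestor WHERE d = ? ORDER BY a
"""

def ancestors_of(person):
    """Load ParentOf into an in-memory database and list all ancestors."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE ParentOf(parent TEXT, child TEXT)")
        conn.executemany("INSERT INTO ParentOf VALUES (?, ?)", PARENT_ROWS)
        return [row[0] for row in conn.execute(ANCESTOR_SQL, (person,))]
    finally:
        conn.close()
```

`ancestors_of("Mary")` returns `["Alice", "Bob", "Carol", "Dave", "Eve"]`: the parents Dave and Eve, Dave's parent Carol, and Carol's parents Alice and Bob.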
/**************************************************************
EXAMPLE 2: Company hierarchy
Find total salary cost of project 'X'
**************************************************************/
create table Employee(ID int, salary int);
create table Manager(mID int, eID int);
create table Project(name text, mgrID int);
insert into Employee values (123, 100);
insert into Employee values (234, 90);
insert into Employee values (345, 80);
insert into Employee values (456, 70);
insert into Employee values (567, 60);
insert into Manager values (123, 234);
insert into Manager values (234, 345);
insert into Manager values (234, 456);
insert into Manager values (345, 567);
insert into Project values ('X', 123);
with recursive
Superior as (select * from Manager
union
select S.mID, M.eID
from Superior S, Manager M
where S.eID = M.mID )
select sum(salary)
from Employee
where ID in
(select mgrID from Project where name = 'X'
union
select eID from Project, Superior
where Project.name = 'X' AND Project.mgrID = Superior.mID );
/*** Alternative formulation tied specifically to project 'X' **/
with recursive
Xemps(ID) as (select mgrID as ID from Project where name = 'X'
union
select eID as ID
from Manager M, Xemps X
where M.mID = X.ID)
select sum(salary)
from Employee
where ID in (select ID from Xemps);
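Both formulations of Example 2 should give the same total. The following sketch runs them side by side with Python's sqlite3 module; `SETUP_SQL`, `GENERAL_SQL`, `XEMPS_SQL`, and `project_x_costs` are names assumed for illustration, and the inserts are merged into multi-row `VALUES` lists only for brevity.

```python
import sqlite3

SETUP_SQL = """
CREATE TABLE Employee(ID INT, salary INT);
CREATE TABLE Manager(mID INT, eID INT);
CREATE TABLE Project(name TEXT, mgrID INT);
INSERT INTO Employee VALUES (123, 100), (234, 90), (345, 80), (456, 70), (567, 60);
INSERT INTO Manager VALUES (123, 234), (234, 345), (234, 456), (345, 567);
INSERT INTO Project VALUES ('X', 123);
"""

# First formulation: compute every (superior, subordinate) pair, then filter.
GENERAL_SQL = """
WITH RECURSIVE
  Superior AS (SELECT * FROM Manager
               UNION
               SELECT S.mID, M.eID
               FROM Superior S, Manager M
               WHERE S.eID = M.mID)
SELECT sum(salary) FROM Employee
WHERE ID IN (SELECT mgrID FROM Project WHERE name = 'X'
             UNION
             SELECT eID FROM Project, Superior
             WHERE Project.name = 'X' AND Project.mgrID = Superior.mID)
"""

# Second formulation: start from project X's manager and walk downwards.
XEMPS_SQL = """
WITH RECURSIVE
  Xemps(ID) AS (SELECT mgrID AS ID FROM Project WHERE name = 'X'
                UNION
                SELECT eID AS ID
                FROM Manager M, Xemps X
                WHERE M.mID = X.ID)
SELECT sum(salary) FROM Employee
WHERE ID IN (SELECT ID FROM Xemps)
"""

def project_x_costs():
    """Return (total from first formulation, total from second formulation)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP_SQL)
        general = conn.execute(GENERAL_SQL).fetchone()[0]
        specific = conn.execute(XEMPS_SQL).fetchone()[0]
        return general, specific
    finally:
        conn.close()
```

`project_x_costs()` returns `(400, 400)`: manager 123 and the four employees below them, 100 + 90 + 80 + 70 + 60. The second formulation only derives rows relevant to project X, whereas the first builds the full `Superior` relation before filtering.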