601. 体育馆的人流量
X 市建了一个新的体育馆,每日人流量信息被记录在这三列信息中:序号 (id)、日期 (date)、 人流量 (people)。
请编写一个查询语句,找出高峰期时段,要求连续三天及以上,并且每天人流量均不少于100。
例如,表 stadium
:
+------+------------+-----------+
| id | date | people |
+------+------------+-----------+
| 1 | 2017-01-01 | 10 |
| 2 | 2017-01-02 | 109 |
| 3 | 2017-01-03 | 150 |
| 4 | 2017-01-04 | 99 |
| 5 | 2017-01-05 | 145 |
| 6 | 2017-01-06 | 1455 |
| 7 | 2017-01-07 | 199 |
| 8 | 2017-01-08 | 188 |
+------+------------+-----------+
对于上面的示例数据,输出为:
+------+------------+-----------+
| id | date | people |
+------+------------+-----------+
| 5 | 2017-01-05 | 145 |
| 6 | 2017-01-06 | 1455 |
| 7 | 2017-01-07 | 199 |
| 8 | 2017-01-08 | 188 |
+------+------------+-----------+
Note:
每天只有一行记录,日期随着 id 的增加而增加。
思路/笔记:自连接的应用,这个语句其实有点取巧,因为日期都是连续的,而Id属于INT类型,比TO_DATE(date, ‘yyyymmdd’)/TO_CHAR(DATE, ‘yyyymmdd’)这样的运算快多了。
select distinct t1.*
from stadium t1, stadium t2, stadium t3
where t1.people >= 100 and t2.people >= 100 and t3.people >= 100
and ((t1.id - t2.id = 1 and t1.id - t3.id = 2 and t2.id - t3.id =1)
or (t2.id - t1.id = 1 and t2.id - t3.id = 2 and t1.id - t3.id =1)
or (t3.id - t2.id = 1 and t2.id - t1.id =1 and t3.id - t1.id = 2)
)
order by t1.id
262. 行程和用户
Trips
表中存所有出租车的行程信息。每段行程有唯一健 Id,Client_Id 和 Driver_Id 是 Users
表中 Users_Id 的外键。Status 是枚举类型,枚举成员为 (‘completed’, ‘cancelled_by_driver’, ‘cancelled_by_client’)。
+----+-----------+-----------+---------+--------------------+----------+
| Id | Client_Id | Driver_Id | City_Id | Status |Request_at|
+----+-----------+-----------+---------+--------------------+----------+
| 1 | 1 | 10 | 1 | completed |2013-10-01|
| 2 | 2 | 11 | 1 | cancelled_by_driver|2013-10-01|
| 3 | 3 | 12 | 6 | completed |2013-10-01|
| 4 | 4 | 13 | 6 | cancelled_by_client|2013-10-01|
| 5 | 1 | 10 | 1 | completed |2013-10-02|
| 6 | 2 | 11 | 6 | completed |2013-10-02|
| 7 | 3 | 12 | 6 | completed |2013-10-02|
| 8 | 2 | 12 | 12 | completed |2013-10-03|
| 9 | 3 | 10 | 12 | completed |2013-10-03|
| 10 | 4 | 13 | 12 | cancelled_by_driver|2013-10-03|
+----+-----------+-----------+---------+--------------------+----------+
Users
表存所有用户。每个用户有唯一键 Users_Id。Banned 表示这个用户是否被禁止,Role 则是一个表示(‘client’, ‘driver’, ‘partner’)的枚举类型。
+----------+--------+--------+
| Users_Id | Banned | Role |
+----------+--------+--------+
| 1 | No | client |
| 2 | Yes | client |
| 3 | No | client |
| 4 | No | client |
| 10 | No | driver |
| 11 | No | driver |
| 12 | No | driver |
| 13 | No | driver |
+----------+--------+--------+
写一段 SQL 语句查出 2013年10月1日 至 2013年10月3日 期间非禁止用户的取消率。基于上表,你的 SQL 语句应返回如下结果,取消率(Cancellation Rate)保留两位小数。
+------------+-------------------+
| Day | Cancellation Rate |
+------------+-------------------+
| 2013-10-01 | 0.33 |
| 2013-10-02 | 0.00 |
| 2013-10-03 | 0.50 |
+------------+-------------------+
思路/笔记:常用操作,手动滑稽
select t.Request_at AS "Day",round(sum(case when t.Status = 'completed' then 0
else 1
end) / count(0),2) AS "Cancellation Rate"
from trips t
where EXISTS (select u.Users_Id
from users u
where u.Users_Id = t.Client_Id
and u.Banned = 'No')
and t.Request_at between '2013-10-01' and '2013-10-03'
group by Day
185. 部门工资前三高的员工
Employee
表包含所有员工信息,每个员工有其对应的 Id, salary 和 department Id 。
+----+-------+--------+--------------+
| Id | Name | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1 | Joe | 70000 | 1 |
| 2 | Henry | 80000 | 2 |
| 3 | Sam | 60000 | 2 |
| 4 | Max | 90000 | 1 |
| 5 | Janet | 69000 | 1 |
| 6 | Randy | 85000 | 1 |
+----+-------+--------+--------------+
Department
表包含公司所有部门的信息。
+----+----------+
| Id | Name |
+----+----------+
| 1 | IT |
| 2 | Sales |
+----+----------+
编写一个 SQL 查询,找出每个部门工资前三高的员工。例如,根据上述给定的表格,查询结果应返回:
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT | Max | 90000 |
| IT | Randy | 85000 |
| IT | Joe | 70000 |
| Sales | Henry | 80000 |
| Sales | Sam | 60000 |
+------------+----------+--------+
思路/笔记:
- 通过自连接排定次序
- join部门表
select d.Name Department, t.Name Employee, Salary
from (select Name,Salary,DepartmentId,
(select count(distinct e1.Salary)
from Employee e1
where e1.DepartmentId = e2.DepartmentId
and e1.Salary > e2.Salary) rank
from Employee e2)t
left join Department d on d.Id = t.DepartmentId
where rank < 3
and d.Id is not null
order by DepartmentId,Salary desc
总结
数据库操作的效率差别,需要在百万/千万/亿/十亿级别,才会有明显的差异,比如BAT的数据级别,只能自行研发数据库。所以这里只涉及到了很多基础的操作,比如case的用法,自连接的用法等。具体在项目常用到的操作有根据范式来设计表、根据常用到的字段来建立索引、考虑根据数据量按日/周/月分区、如何通过存储过程/函数高效转移数据、如何设置UNDO表空间保证数据库性能、怎样查看常用视图与统计表、如何备份、还原与清理数据…都还是需要在实际项目里去学习。