# Summary of Hive Window Functions

``````
-- Sample table: page views (pv) per cookieid and create_time
CREATE TABLE lxy (cookieid INT, create_time STRING, pv INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

-- Inspect the loaded rows
SELECT * FROM lxy;
``````

# Aggregate functions: SUM(), MIN(), MAX(), AVG(), etc.

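``````
SELECT *,
-- pv1: sum over the current row and up to 3 preceding rows
SUM(a.pv) OVER (PARTITION BY cookieid ORDER BY create_time ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS pv1,
-- pv2: sum over up to 2 preceding rows, the current row, and 1 following row
SUM(a.pv) OVER (PARTITION BY cookieid ORDER BY create_time ROWS BETWEEN 2 PRECEDING AND 1 FOLLOWING) AS pv2
FROM lxy AS a;
``````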
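To make the two frames concrete, the sketch below applies the same `SUM(pv) OVER (...)` expressions to a few inline rows instead of the `lxy` table. The CTE name `sample_pv` and its values are invented for illustration, and the query assumes a Hive version that supports CTEs and `SELECT` without `FROM` (0.13 or later).

``````sql
-- Hypothetical sample rows standing in for the lxy table
WITH sample_pv AS (
  SELECT 1 AS cookieid, '2015-04-10' AS create_time, 1 AS pv
  UNION ALL SELECT 1 AS cookieid, '2015-04-11' AS create_time, 5 AS pv
  UNION ALL SELECT 1 AS cookieid, '2015-04-12' AS create_time, 7 AS pv
  UNION ALL SELECT 1 AS cookieid, '2015-04-13' AS create_time, 3 AS pv
  UNION ALL SELECT 1 AS cookieid, '2015-04-14' AS create_time, 2 AS pv
)
SELECT cookieid, create_time, pv,
  -- current row plus up to 3 rows before it
  SUM(pv) OVER (PARTITION BY cookieid ORDER BY create_time
                ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS pv1,
  -- up to 2 rows before, the current row, and 1 row after
  SUM(pv) OVER (PARTITION BY cookieid ORDER BY create_time
                ROWS BETWEEN 2 PRECEDING AND 1 FOLLOWING) AS pv2
FROM sample_pv
ORDER BY create_time;
``````

With these five rows, pv1 should come out as 1, 6, 13, 16, 17 and pv2 as 6, 13, 16, 17, 12. The frame simply shrinks at the edges of the partition, where fewer preceding or following rows exist.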

# Adding ranking columns: NTILE, ROW_NUMBER(), RANK(), DENSE_RANK()

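``````
SELECT *,
-- n1: splits each partition into 3 ordered buckets
NTILE(3) OVER (PARTITION BY cookid2 ORDER BY pv) AS n1,
-- n2: consecutive numbers; tied rows are numbered in arbitrary order
ROW_NUMBER() OVER (PARTITION BY cookid2 ORDER BY pv) AS n2,
-- n3: tied rows share a rank and the following ranks are skipped
RANK() OVER (PARTITION BY cookid2 ORDER BY pv) AS n3,
-- n4: tied rows share a rank and no ranks are skipped
DENSE_RANK() OVER (PARTITION BY cookid2 ORDER BY pv) AS n4
FROM lxy3;
``````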
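The four functions only differ visibly when the ordering column contains ties. The following sketch runs them on inline rows with a duplicated `pv` value; the CTE name `sample_rank` and its data are hypothetical, and the column names merely mirror the query above.

``````sql
-- Hypothetical rows with a tie on pv = 2
WITH sample_rank AS (
  SELECT 'c1' AS cookid2, 1 AS pv
  UNION ALL SELECT 'c1' AS cookid2, 2 AS pv
  UNION ALL SELECT 'c1' AS cookid2, 2 AS pv
  UNION ALL SELECT 'c1' AS cookid2, 3 AS pv
  UNION ALL SELECT 'c1' AS cookid2, 5 AS pv
  UNION ALL SELECT 'c1' AS cookid2, 7 AS pv
)
SELECT pv,
  NTILE(3)     OVER (PARTITION BY cookid2 ORDER BY pv) AS n1,
  ROW_NUMBER() OVER (PARTITION BY cookid2 ORDER BY pv) AS n2,
  RANK()       OVER (PARTITION BY cookid2 ORDER BY pv) AS n3,
  DENSE_RANK() OVER (PARTITION BY cookid2 ORDER BY pv) AS n4
FROM sample_rank
ORDER BY pv, n2;
``````

On this data ROW_NUMBER should yield 1 through 6, RANK 1, 2, 2, 4, 5, 6, and DENSE_RANK 1, 2, 2, 3, 4, 5, while NTILE(3) places two rows in each of the three buckets.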

The following queries on `lxy3` use LAG, FIRST_VALUE, and LAST_VALUE:

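``````
SELECT *,
-- lag1: pv from 2 rows earlier in the partition (NULL when there is none)
LAG(pv, 2) OVER(PARTITION BY cookid2 ORDER BY log_date) AS lag1,
-- FIRST_VALUE and LAST_VALUE require an argument; pv is assumed here
FIRST_VALUE(pv) OVER(PARTITION BY cookid2 ORDER BY log_date) AS first_pv,
-- assumed intent: reverse the order so the first value is the partition's last pv
FIRST_VALUE(pv) OVER(PARTITION BY cookid2 ORDER BY log_date DESC) AS last_pv,
-- with the default frame this is the last value up to the current row
LAST_VALUE(pv) OVER(PARTITION BY cookid2 ORDER BY log_date) AS current_last_pv
FROM lxy3;
``````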

``````
SELECT *,
-- highest pv in each partition, because rows are ordered by pv descending
FIRST_VALUE(pv) OVER(PARTITION BY cookid2 ORDER BY pv DESC) AS first_pv
FROM lxy3;
``````
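A common surprise with LAST_VALUE is that, when ORDER BY is given without an explicit frame, the default frame ends at the current row, so the function returns the current row's value rather than the partition's last one. The sketch below contrasts the default frame with an explicit full-partition frame; the CTE name `sample_log` and its rows are invented for illustration.

``````sql
-- Hypothetical rows standing in for the lxy3 table
WITH sample_log AS (
  SELECT 'c1' AS cookid2, '2015-04-10' AS log_date, 1 AS pv
  UNION ALL SELECT 'c1' AS cookid2, '2015-04-11' AS log_date, 5 AS pv
  UNION ALL SELECT 'c1' AS cookid2, '2015-04-12' AS log_date, 7 AS pv
)
SELECT log_date, pv,
  -- value from two rows back, NULL for the first two rows
  LAG(pv, 2) OVER (PARTITION BY cookid2 ORDER BY log_date) AS lag2,
  -- default frame: last value seen up to the current row
  LAST_VALUE(pv) OVER (PARTITION BY cookid2 ORDER BY log_date) AS current_last_pv,
  -- explicit frame covering the whole partition
  LAST_VALUE(pv) OVER (PARTITION BY cookid2 ORDER BY log_date
                       ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS partition_last_pv
FROM sample_log
ORDER BY log_date;
``````

Here lag2 should be NULL, NULL, 1; current_last_pv should follow pv as 1, 5, 7; and partition_last_pv should be 7 on every row.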

# GROUPING SETS, CUBE, ROLLUP

``````
CREATE EXTERNAL TABLE lxw1234 (
month STRING,
day STRING,
cookieid STRING -- assumed column and type: later queries use COUNT(DISTINCT cookieid)
) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/user/chenlinlin2156233/lxy2/';
``````

``````
SELECT * FROM lxw1234;
``````

GROUPING SETS(key1, key2) is equivalent to grouping by each listed key separately and then combining the results with UNION ALL.

``````
-- GROUPING__ID (two underscores) identifies the grouping set that produced each row
SELECT month,
day,
GROUPING__ID
FROM lxw1234
GROUP BY month, day
GROUPING SETS(month, day)
ORDER BY GROUPING__ID;
``````

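1. `GROUPING__ID` is generated automatically once GROUPING SETS is applied.
2. Its name contains two underscores.
3. GROUP BY comes first, and GROUPING SETS follows it.

This is equivalent to grouping separately and then combining the results with UNION ALL. In the query below, the first two branches correspond to GROUPING SETS(month, day); the third branch corresponds to additionally listing (month, day) as a grouping set. The query also adds a uv column, the number of distinct cookieid values.
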
``````
-- Branches 1 and 2 reproduce GROUPING SETS(month, day)
SELECT month, NULL AS day, COUNT(DISTINCT cookieid) AS uv, 1 AS GROUPING__ID FROM lxw1234 GROUP BY month
UNION ALL
SELECT NULL AS month, day, COUNT(DISTINCT cookieid) AS uv, 2 AS GROUPING__ID FROM lxw1234 GROUP BY day
UNION ALL
-- Branch 3 corresponds to the extra grouping set (month, day)
SELECT month, day, COUNT(DISTINCT cookieid) AS uv, 3 AS GROUPING__ID FROM lxw1234 GROUP BY month, day;
``````

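CUBE aggregates over every combination of the GROUP BY columns. Compared with the GROUPING SETS above, it adds the two-column grouping (month, day) as well as the grand total.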

``````
SELECT month,
day,
GROUPING__ID
FROM lxw1234
GROUP BY month, day
WITH CUBE
ORDER BY GROUPING__ID;
``````
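As a sketch of what CUBE expands to, the query below spells out the same grouping sets by hand on a few inline rows. The CTE name `sample_visits` and its values are invented for illustration, and a uv count is added so each grouping level shows a distinct number; the empty set `()` stands for the grand total.

``````sql
-- Hypothetical rows standing in for the lxw1234 table
WITH sample_visits AS (
  SELECT '2015-03' AS month, '2015-03-10' AS day, 'cookie1' AS cookieid
  UNION ALL SELECT '2015-03' AS month, '2015-03-10' AS day, 'cookie5' AS cookieid
  UNION ALL SELECT '2015-03' AS month, '2015-03-12' AS day, 'cookie7' AS cookieid
  UNION ALL SELECT '2015-04' AS month, '2015-04-12' AS day, 'cookie3' AS cookieid
)
SELECT month, day, COUNT(DISTINCT cookieid) AS uv, GROUPING__ID
FROM sample_visits
GROUP BY month, day
-- the four grouping sets that WITH CUBE generates for two columns
GROUPING SETS ((month, day), month, day, ())
ORDER BY GROUPING__ID;
``````

On this sample the result should contain 9 rows: 3 for (month, day), 2 for month, 3 for day, and 1 grand total with uv = 4. Replacing the GROUPING SETS clause with WITH CUBE should return the same rows.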

``````
-- ROLLUP aggregates hierarchically from the leftmost column: (month, day), (month), ()
SELECT month,
day,
GROUPING__ID
FROM lxw1234
GROUP BY month, day
WITH ROLLUP
ORDER BY GROUPING__ID;
``````

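The result of ROLLUP depends on the order of the GROUP BY columns. With `GROUP BY day, month`, the rollup produces per-day totals and the grand total instead of per-month totals: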

``````
SELECT month,
day,
GROUPING__ID
FROM lxw1234
GROUP BY day, month
WITH ROLLUP
ORDER BY GROUPING__ID;
``````
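ROLLUP can likewise be written as explicit grouping sets, which makes the effect of column order easier to see. The sketch below is the hand-written form of `GROUP BY month, day WITH ROLLUP` on inline rows; the CTE name `sample_visits` and its values are hypothetical, and the uv count is added only to make the levels distinguishable.

``````sql
-- Hypothetical rows standing in for the lxw1234 table
WITH sample_visits AS (
  SELECT '2015-03' AS month, '2015-03-10' AS day, 'cookie1' AS cookieid
  UNION ALL SELECT '2015-03' AS month, '2015-03-10' AS day, 'cookie5' AS cookieid
  UNION ALL SELECT '2015-03' AS month, '2015-03-12' AS day, 'cookie7' AS cookieid
  UNION ALL SELECT '2015-04' AS month, '2015-04-12' AS day, 'cookie3' AS cookieid
)
SELECT month, day, COUNT(DISTINCT cookieid) AS uv, GROUPING__ID
FROM sample_visits
GROUP BY month, day
-- month, day WITH ROLLUP: full detail, then per month, then the grand total
GROUPING SETS ((month, day), month, ())
ORDER BY GROUPING__ID;
``````

This should return 6 rows on the sample: 3 for (month, day), 2 for month, and 1 grand total. For `GROUP BY day, month WITH ROLLUP` the corresponding sets are `((day, month), day, ())`, which would give 7 rows here because there are three distinct days.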

# References

Original author: 九日照林
Original URL: https://www.jianshu.com/p/9fda829b1ef1