1. get_json_object
Example: extract the sale_price field from a JSON string
get_json_object(detail_json,'$.sale_price')
2. sum(case when…then…else end)
Example: total sales amount on day 7
sum(case when by_day=7 then pay_amt else 0 end)
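3. count(case when…then…end)
Example: number of distinct users who placed an order on day 7. Leave out the else branch so that non-matching rows yield NULL, which count ignores; an else 0 would itself be counted as a value.
count(distinct case when by_day=7 then user_id end) as day_7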
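4. min(case when…then…end)
Example: date of the first order placed by VIP users. As with count, omit the else branch so that non-VIP rows become NULL and are ignored by min.
min(case when is_vip=1 then order_dt end)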
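The three conditional aggregates above (items 2 to 4) can be tried on a tiny data set. The following is a minimal sketch, not the author's setup: it uses Python's built-in sqlite3 module as a stand-in for Hive, and the table layout and rows of order_tb are made up for illustration. It also shows what goes wrong when an else 0 is added inside count(distinct …).

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table order_tb ("
    "user_id integer, by_day integer, pay_amt integer, is_vip integer, order_dt text)"
)
# Hypothetical sample rows: (user_id, by_day, pay_amt, is_vip, order_dt)
conn.executemany(
    "insert into order_tb values (?, ?, ?, ?, ?)",
    [
        (1, 7, 100, 1, "2019-04-07"),
        (1, 7, 50, 1, "2019-04-07"),
        (2, 7, 30, 0, "2019-04-07"),
        (3, 6, 80, 1, "2019-04-06"),
        (2, 1, 20, 0, "2019-04-01"),
    ],
)

day_7_amt, day_7_users, day_7_users_wrong, first_vip_dt = conn.execute(
    """
    select
        -- total sales on day 7: non-matching rows contribute 0
        sum(case when by_day = 7 then pay_amt else 0 end),
        -- distinct users on day 7: non-matching rows become NULL and are ignored
        count(distinct case when by_day = 7 then user_id end),
        -- pitfall: with else 0, the value 0 is counted as one more "user"
        count(distinct case when by_day = 7 then user_id else 0 end),
        -- earliest order date among VIP rows only
        min(case when is_vip = 1 then order_dt end)
    from order_tb
    """
).fetchone()

print(day_7_amt, day_7_users, day_7_users_wrong, first_vip_dt)
```

On this sample data the count with else 0 comes out one higher than the correct count, which is why the else branch is left out for count and min but kept for sum.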
5. row_number() over([partition by col1] order by col2)
Example: for each order, find which order number it is for that user (1st, 2nd, 3rd, ...)
row_number() over (partition by user_id order by order_time asc) as order_cnt
Besides row_number, there are also rank and dense_rank.
Syntax:
rank() over([partition by col1] order by col2)
dense_rank() over([partition by col1] order by col2)
row_number() over([partition by col1] order by col2)
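row_number(): numbers the rows sequentially, so tied rows still get different numbers (1, 2, 3, 4)
[Figure: row_number() example. Image source: https://www.cnblogs.com/ianunspace/p/5057333.html]
rank(): tied rows share the same rank, and the ranks after a tie are skipped (1, 2, 2, 4)
[Figure: rank() example. Image source: https://www.cnblogs.com/ianunspace/p/5057333.html]
dense_rank(): tied rows share the same rank, and the next rank follows without a gap (1, 2, 2, 3)
[Figure: dense_rank() example. Image source: https://www.cnblogs.com/ianunspace/p/5057333.html]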
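To see the three ranking functions side by side, here is a small sketch that ranks four orders by amount, two of which are tied. It runs the SQL through Python's built-in sqlite3 module as a stand-in for Hive (window functions need SQLite 3.25 or later); the order_tb columns and rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("create table order_tb (order_id integer, pay_amt integer)")
# Hypothetical sample rows; orders 2 and 3 are tied on pay_amt.
conn.executemany(
    "insert into order_tb values (?, ?)",
    [(1, 100), (2, 90), (3, 90), (4, 80)],
)

ranked = conn.execute(
    """
    select
        pay_amt,
        row_number() over (order by pay_amt desc) as rn,   -- always 1, 2, 3, 4
        rank()       over (order by pay_amt desc) as rk,   -- ties share a rank, then skip
        dense_rank() over (order by pay_amt desc) as drk   -- ties share a rank, no gap
    from order_tb
    order by rn
    """
).fetchall()

for row in ranked:
    print(row)
```

Which of the two tied orders receives row number 2 and which receives 3 is not determined by this ordering; add a tie-breaker column to the order by if that matters.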
6. lag(col, n) over([partition by col1] order by col2)
Example: number of users who placed orders on 5 consecutive days between April 1 and April 10
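lag(order_dt,4) over(partition by user_id order by order_dt): for each row, returns the order date 4 rows earlier within the same user's dates, sorted ascending. Since base_data keeps one row per user per date, subtracting that date from order_dt gives exactly 4 when the current date closes a run of 5 consecutive order days, NULL when the user has fewer than 4 earlier order dates, and more than 4 when at least one day without an order lies in between. An offset of 5 with a difference of 5 would instead test for 6 consecutive days, and a filter of diff>=5 would also let through users whose dates have gaps.
-- users who ordered on 5 consecutive days
with
base_data as(
-- one row per user per order date
select
distinct
user_id,
order_dt
from order_tb
where order_dt between '20190401' and '20190410'
),
res1 as (
select
user_id,
order_dt,
-- days between this order date and the 4th previous order date of the same user
-- note: Hive's datediff expects yyyy-MM-dd strings or date values; if order_dt is stored as yyyyMMdd, convert it first
datediff(order_dt,lag(order_dt,4) over(partition by user_id order by order_dt)) as diff
from base_data
)
select
count(distinct user_id) as num
from res1
where diff=4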
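The consecutive-day logic is easy to get off by one, so it helps to check it on hand-made data. A run of 5 consecutive days means the date 4 rows back is exactly 4 days earlier. The sketch below assumes Python's built-in sqlite3 as a stand-in for Hive (SQLite 3.25 or later), with julianday subtraction replacing datediff and dates stored as yyyy-MM-dd text. The sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("create table order_tb (user_id integer, order_dt text)")

rows = (
    # user 1: exactly 5 consecutive days -> qualifies
    [(1, f"2019-04-0{d}") for d in (1, 2, 3, 4, 5)]
    # user 2: 6 order dates but a gap on April 4 -> does not qualify
    + [(2, f"2019-04-0{d}") for d in (1, 2, 3, 5, 6, 7)]
    # user 3: 4 orders on only 3 distinct days -> does not qualify
    + [(3, "2019-04-01"), (3, "2019-04-02"), (3, "2019-04-02"), (3, "2019-04-03")]
    # user 4: 6 consecutive days -> qualifies
    + [(4, f"2019-04-0{d}") for d in (3, 4, 5, 6, 7, 8)]
)
conn.executemany("insert into order_tb values (?, ?)", rows)

query = """
with base_data as (
    -- one row per user per order date
    select distinct user_id, order_dt
    from order_tb
    where order_dt between '2019-04-01' and '2019-04-10'
),
res1 as (
    select
        user_id,
        order_dt,
        -- whole days between this date and the date 4 rows earlier (NULL if none)
        cast(julianday(order_dt)
             - julianday(lag(order_dt, 4) over (partition by user_id order by order_dt))
             as integer) as diff
    from base_data
)
select distinct user_id
from res1
where diff = 4
order by user_id
"""
streak_users = [r[0] for r in conn.execute(query)]
print(streak_users)
```

User 2 is the interesting case: with an offset of 5 and a filter of diff >= 5, the April 7 row would see April 1 and a difference of 6, and the user would be counted even though there is no 5-day streak.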