SQL查询语句编写规范

2023年1月15日 284次阅读来源: Jogging

目的
数据技术团队，每个成员经常要编写SQL来查询和统计数据，数据源有Hive和DB等，每个人编写习惯或多或少有一些差异，这些差异并不一定会影响结果，但是团队中的其他人阅读代码时会产生困扰，尤其是一些复杂的多表关联查询SQL，更是让很多代码阅读者困惑，因此有必要在团队内部制定SQL编写规范，在SQL代码书写上达成一致，提高可视化和良好的阅读体验。
编写规范
A、字段别名尽量不要使用拼音，要使用比较贴切的英文单词组合，做到文词达意；
B、字段别名尽量使用下划线分割多个单词；

例如：login_cnt（登录次数)、login_users(登录人数)等；

C、字段别名和子查询别名时不要省略as关键字;

例如: 
count(distinct user_id) as login_users
count(user_id) as login_cnt

D、SQL语句编写使用全小写字符，除非有个别大小写敏感的字段名称；
E、SQL语句中如遇from、where、on、left join、inner join、right join、group by、order by等关键字，单独占一行；

例如:
select user_id,user_name,gender 
from user
where reg_time>='2017-05-01' and reg_time<'2017-06-01'

F、避免在单独的SQL查询语句中写表别名;

反例：
select u.user_id,u.user_name,u.gender 
from user as u
where u.reg_time>='2017-05-01' and u.reg_time<'2017-06-01'
在这里给user 起别名为u，没有任何意义，徒增代码量和阅读障碍

G、时间范围作为查询条件时，请使用>=和<作为时间范围，不要使用>=和<=这样的写法；

例如:
查询5月份登录用户数
select count(distinct user_id) as login_users
from user_login
where login_time>='2017-05-01' and login_time<'2017-06-01'
反例:
select count(distinct user_id) as login_users
from user_login
where login_time>='2017-05-01' and login_time<='2017-05-31 23:59:59'
这样写法会因为login_time时间精度的不同导致漏掉一些数据；且书写时比较复杂；

H、子查询别名尽量使用it (inner table)、t1 (temp table 1) 和rs1(record result 1)命名；

例如:
select min_date,count(mobile) as mobiles
from
(
    select mobile,min(vdate) as min_date
    from
    (
        select vdate,mobile
        from
        (
            select vdate,split(mobile,',') as mobiles
            from s_jd_picked
        ) as it1
        lateral view outer explode(mobiles) explode_table1 as mobile
    ) as t1
    where length(mobile)=11
    group by mobile
) as rs
where min_date>='2017-04-01' and min_date<'2017-05-01'
group by min_date
order by min_date

I、SQL查询语句要有层次结构，使用4个空格缩进，在Hive中制表符不能作为缩进；
J、单行字符长度不要超过80个字符，如果超过请换行，提高代码阅读性；

下面提供一个较完整的例子:

例如:
select date(schedule_date) as schedule_date,
sum(cast(onBoardCount as float)) as first_billing_money, 
sum(cast(secondValue as float)) as second_billing_money
from
(
    select t1.project_id,t1.schedule_id,
    case when t2.schedule_date is null then created_time else schedule_date end as schedule_date,
    t1.mobile,interview_status
    from
    (
        select mobile as mobile,projectId as project_id,
        cast(projectScheduleId as int) as schedule_id,created_time,
        cast(interviewerStatus as int) as interview_status
        from mf_project_track
        where created_time>='2017-01-01' 
    ) as t1
    left join
    (
        select projectId as project_id,id as schedule_id,
        to_timestamp((executionDate) as schedule_date
        from mf_project_schedule
    ) as t2
    on t1.project_id=t2.project_id and t1.schedule_id=t2.schedule_id
) as rs1
inner join 
mf_project as rs2
on cast(rs1.project_id as int)=rs2.id
where schedule_date>='2017-04-01'
group by date(schedule_date)

    原文作者：Jogging
    原文地址: https://www.jianshu.com/p/43d8a7fa4de5
    本文转自网络文章，转载此文章仅为分享知识，如有侵权，请联系博主进行删除。