对于使用字段GRP分组的每个组,我想检索A列中最常出现的值和B列中最常出现的值,并且可能对许多其他列执行此操作.
样本数据:
GRP | A | B
-----------
Cat | 1 | 1
Cat | 2 | 1
Cat | 3 | 2
Cat | 3 | 3
Dog | 5 | 6
Dog | 5 | 7
Dog | 6 | 7
预期产出:
GRP | A | B
-----------
Cat | 3 | 1
Dog | 5 | 7
此查询实现了该结果:
SELECT
freq1.GRP,
freq1.A,
freq2.B
FROM (
SELECT
GRP,
A,
ROW_NUMBER() OVER(PARTITION BY GRP ORDER BY COUNT(*) DESC) AS F_RANK
FROM MyTable
GROUP BY GRP, A
) AS freq1
INNER JOIN (
SELECT
GRP,
B,
ROW_NUMBER() OVER(PARTITION BY GRP ORDER BY COUNT(*) DESC) AS F_RANK
FROM MyTable
GROUP BY GRP, B
) AS freq2 ON freq2.GRP = freq1.GRP
WHERE freq1.F_RANK = 1 AND freq2.F_RANK = 1
它看起来效率不高,如果我要添加C,D等列,那就更不用了……
有没有更好的办法?
最佳答案 我不会说这种方法“更好”,因为它会产生完全相同的执行计划.但是,随着列数的增加,我发现这种方法更易于维护.对我来说,这更容易阅读.
with GroupA as
(
select Grp
, A
, ROW_NUMBER() over(partition by grp order by count(*) desc) as RowNum
from MyTable
group by Grp, A
)
, GroupB as
(
select Grp
, B
, ROW_NUMBER() over(partition by grp order by count(*) desc) as RowNum
from MyTable
group by Grp, B
)
select a.Grp
, a.A
, b.B
from GroupA a
inner join GroupB b on a.Grp = b.Grp and b.RowNum = 1
where a.RowNum = 1;