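I am trying to calculate customer churn based on the events people could have followed, as opposed to churn by date, which is the usual approach. We have events that are tied to a specific host; in my example all events are hosted by Alice, but there could be different hosts.
Everyone who follows a particular event should be placed in one category (new, active, churned, or resurrected).
New: the first time a person follows an event by a particular host.
Active: follows again (and also followed the previous event by that host).
Churned: the follower had the opportunity to follow an event but did not.
Resurrected: a follower who had churned starts following the previously followed host again.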
declare @events table (event varchar(50), host varchar(50), date date)
declare @eventFollows table (event varchar(50), follower varchar(50))
insert into @events values ('e_1', 'Alice', GETDATE())
insert into @events values ('e_2', 'Alice', GETDATE())
insert into @events values ('e_3', 'Alice', GETDATE())
insert into @events values ('e_4', 'Alice', GETDATE())
insert into @events values ('e_5', 'Alice', GETDATE())
insert into @eventFollows values ('e_1', 'Bob') --new
insert into @eventFollows values ('e_2', 'Bob') --active
--Bob churned
insert into @eventFollows values ('e_4', 'Megan') --new
insert into @eventFollows values ('e_5', 'Bob') --resurrected
insert into @eventFollows values ('e_5', 'Megan') --active
select * from @events
select * from @eventFollows
The expected result should look like this:
select 'e_1', 1 as New, 0 as resurrected, 0 as active, 0 as churned --First time Bob follows Alice event
union all
select 'e_2', 0 as New, 0 as resurrected, 1 as active, 0 as churned --Bob follows the next event that Alice host (considered as Active)
union all
select 'e_3', 0 as New, 0 as resurrected, 0 as active, 1 as churned --Bob churns since he does not follow the next event
union all
select 'e_4', 1 as New, 0 as resurrected, 0 as active, 0 as churned --First time Megan follows Alice event
union all
select 'e_5', 0 as New, 1 as resurrected, 1 as active, 0 as churned --Second time (active) for Megan and Bob is resurrected
I started with a query like the one below, but the problem is that I do not get the events a follower did not follow (but could have followed).
select a.event, follower, date,
LAG (a.event,1) over (partition by a.host, ma.follower order by date) as lag,
LEAD (a.event,1) over (partition by a.host, ma.follower order by date) as lead,
LAG (a.event,1) over (partition by a.host order by date) as lagP,
LEAD (a.event,1) over (partition by a.host order by date) as leadP
from @events a left join @eventFollows ma on ma.event = a.event order by host, follower, date
Any ideas?
Best answer: This may look like an indirect approach, but the islands can be detected by checking for gaps in the numbering:
;with nrsE as
(
select *, ROW_NUMBER() over (partition by host order by event) rnrE from @events -- number events per host so gaps are measured within one host's sequence
), nrs as
(
select f.*,host, rnrE, ROW_NUMBER() over (partition by f.follower, e.host order by f.event ) rnrF
from nrsE e
join @eventFollows f on f.event = e.event
), f as
(
select host, follower, min(rnrE) FirstE, max(rnrE) LastE, ROW_NUMBER() over (partition by follower, host order by rnrE - rnrF) SeqNr
from nrs
group by host, follower, rnrE - rnrF --difference between rnr-Event and rnr-Follower to detect gaps
), stat as --from the result above on there are several options. this example uses getting a 'status' and pivoting on it
(
select e.event, e.host,
    case
        when f.FirstE is null then 'No participants'
        when f.LastE = e.rnrE - 1 then 'Churned'
        when e.rnrE = f.FirstE then case when f.SeqNr = 1 then 'New' else 'Resurrected' end
        else 'Active'
    end Status
from nrsE e
left join f on e.rnrE between f.FirstE and f.LastE + 1 and e.host = f.host
)
select p.* from stat pivot(count(Status) for Status in ([New], [Resurrected], [Active], [Churned])) p
The last two steps can be simplified, but getting a 'status' this way may be reusable in other scenarios.
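As one possible simplification, the status step and the pivot could be folded into a single grouped select that counts each category with conditional aggregation. This is only a sketch under a few assumptions: the CTE name isl is made up here (it plays the role of f above), events are numbered per host, and event names are assumed to sort in chronological order, as they do in the sample data.

```sql
-- Sample data copied from the question
declare @events table (event varchar(50), host varchar(50), date date)
declare @eventFollows table (event varchar(50), follower varchar(50))
insert into @events values
    ('e_1', 'Alice', GETDATE()), ('e_2', 'Alice', GETDATE()), ('e_3', 'Alice', GETDATE()),
    ('e_4', 'Alice', GETDATE()), ('e_5', 'Alice', GETDATE())
insert into @eventFollows values
    ('e_1', 'Bob'), ('e_2', 'Bob'), ('e_4', 'Megan'), ('e_5', 'Bob'), ('e_5', 'Megan')

;with nrsE as
(
    -- number the events of each host (assumes event names sort chronologically)
    select *, ROW_NUMBER() over (partition by host order by event) rnrE from @events
), nrs as
(
    -- number the events each follower actually followed, per host
    select f.follower, e.host, e.rnrE,
           ROW_NUMBER() over (partition by f.follower, e.host order by f.event) rnrF
    from nrsE e
    join @eventFollows f on f.event = e.event
), isl as
(
    -- rnrE - rnrF stays constant within a run of consecutive follows (an island)
    select host, follower, min(rnrE) FirstE, max(rnrE) LastE,
           ROW_NUMBER() over (partition by follower, host order by rnrE - rnrF) SeqNr
    from nrs
    group by host, follower, rnrE - rnrF
)
select e.event, e.host,
       sum(case when e.rnrE = i.FirstE and i.SeqNr = 1 then 1 else 0 end) as New,
       sum(case when e.rnrE = i.FirstE and i.SeqNr > 1 then 1 else 0 end) as Resurrected,
       sum(case when e.rnrE > i.FirstE and e.rnrE <= i.LastE then 1 else 0 end) as Active,
       sum(case when e.rnrE = i.LastE + 1 then 1 else 0 end) as Churned
from nrsE e
-- each island also covers the event right after it, which is where the churn is counted
left join isl i on e.rnrE between i.FirstE and i.LastE + 1 and e.host = i.host
group by e.event, e.host
order by e.event
```

On the sample data this should give the same counts as the expected result in the question. An event that nobody followed and nobody churned on simply shows zeros in all four columns, since the left join finds no island for it.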