I want to join two tables on two fields, using an equi-join on one and a rolling join on the other. The data I am working with is as follows:
library(data.table)
dt <- data.table(Date = as.Date(c("2015-12-29", "2015-12-29", "2015-12-29", "2015-12-29", "2016-01-30", "2016-01-30", "2016-01-30", "2016-01-30", "2016-02-29", "2016-02-29", "2016-02-29", "2016-02-29", "2016-03-26", "2016-03-26", "2016-03-26", "2016-03-26")),
ID = c("A", "B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D"),
Value = c("A201512", "B201512", "C201512", "D201512", "A201601", "B201601", "C201601", "D201601", "A201602", "B201602", "C201602", "D201602", "A201603", "B201603", "C201603", "D201603"), key = c('Date', 'ID'))
dtes <- data.table(Date=as.Date(c("2015-12-31", "2016-01-31", "2016-02-29", "2016-03-31")), key="Date")
dte <- CJ(Date=dtes$Date, ID=unique(dt$ID))
I want to join the tables 'dt' and 'dte' on ID (with an equi-join) AND on Date (with a rolling join).
dt[dte, roll=TRUE]
gives me
# Date ID Value
# 1: 2015-12-31 A NA
# 2: 2015-12-31 B NA
# 3: 2015-12-31 C NA
# 4: 2015-12-31 D NA
# 5: 2016-01-31 A NA
# 6: 2016-01-31 B NA
# 7: 2016-01-31 C NA
# 8: 2016-01-31 D NA
# 9: 2016-02-29 A A201602
# 10: 2016-02-29 B B201602
# 11: 2016-02-29 C C201602
# 12: 2016-02-29 D D201602
# 13: 2016-03-31 A NA
# 14: 2016-03-31 B NA
# 15: 2016-03-31 C NA
# 16: 2016-03-31 D NA
The result I am after looks like this:
# Date ID Value
# 2016-03-31 A A201603
# 2016-02-29 A A201602
# 2016-01-31 A A201601
# 2015-12-31 A A201512
# 2016-03-31 B B201603
# 2016-02-29 B B201602
# 2016-01-31 B B201601
# 2015-12-31 B B201512
# 2016-03-31 C C201603
# 2016-02-29 C C201602
# 2016-01-31 C C201601
# 2015-12-31 C C201512
# 2016-03-31 D D201603
# 2016-02-29 D D201602
# 2016-01-31 D D201601
# 2015-12-31 D D201512
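Is this possible in data.table?
Best answer: Yes. Set the keys in the opposite order; the roll applies to the last column of the join. With the key (Date, ID), Date is matched exactly and ID is the column that rolls, which is why only the 2016-02-29 rows matched above.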
setkey(dt, ID, Date)
setkey(dte, ID, Date)
dt[dte, roll=TRUE][order(ID, -Date)]
Date ID Value
1: 2016-03-31 A A201603
2: 2016-02-29 A A201602
3: 2016-01-31 A A201601
4: 2015-12-31 A A201512
5: 2016-03-31 B B201603
6: 2016-02-29 B B201602
7: 2016-01-31 B B201601
8: 2015-12-31 B B201512
9: 2016-03-31 C C201603
10: 2016-02-29 C C201602
11: 2016-01-31 C C201601
12: 2015-12-31 C C201512
13: 2016-03-31 D D201603
14: 2016-02-29 D D201602
15: 2016-01-31 D D201601
16: 2015-12-31 D D201512
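Alternatively, instead of using setkey, just use X[Y, on = cols, roll = TRUE] with cols written in the right order (assuming the bug mentioned in the comments above has been fixed).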
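As a minimal sketch of that on= form, here is a cut-down version of the question's data (two IDs and two months, chosen only for illustration). It assumes a data.table version in which roll works together with on=. ID comes first so it is matched exactly, and Date comes last so it is the column that rolls.

```r
library(data.table)

# Cut-down, unkeyed copies of the question's tables (illustrative subset).
dt <- data.table(
  Date  = as.Date(c("2015-12-29", "2015-12-29", "2016-01-30", "2016-01-30")),
  ID    = c("A", "B", "A", "B"),
  Value = c("A201512", "B201512", "A201601", "B201601")
)
dte <- CJ(Date = as.Date(c("2015-12-31", "2016-01-31")), ID = c("A", "B"))

# ID is matched exactly; Date, listed last in on=, is rolled forward,
# so each month-end date picks up the most recent earlier Value per ID.
res <- dt[dte, on = c("ID", "Date"), roll = TRUE]
print(res[order(ID, -Date)])
```

Neither table needs a key for this, so the original keys on dt and dte can stay as they are.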