How can I vectorize this loop? Multiply two matrices, store the result, and repeat many times without looping

August 23, 2023

Suppose (using small numbers for this example) that I have an array of dimension

3 x 14 x 5

Call it dfarray:

set.seed(1)
dfarray=array(rnorm(5*3*14,0,1),dim=c(3,14,5))

I also have a matrix that corresponds to it, of dimension

39 (which is 13*3) x 14

Call this matrix dfmat:

dfmat = matrix(rnorm(13*3*14,0,1),39,14)
dfmat = cbind(dfmat,rep(1:3,13))
dfmat = dfmat[order(dfmat [,15]),]
colnames(dfmat)[15]='unit'

What I want to do is run this loop:

 costs = c(0.45, 2.11, 1.05, 1.44, 0.88, 2.30, 1.96, 1.76, 2.06, 1.54, 1.69,1.75,0)
    p = c(1,2,3,1,4,3,2,1,4,1,3,4,0)
    profit=numeric(0)
    for(i in 1:3){
            j=13
            beta = dfarray[i,,]
            Xt = dfmat [which(dfmat [,'unit']==i),1:14]    #this takes a set of 13, Xt is 13x14

            Xbeta = exp( Xt %*% beta )
            iota = c(rep(1, j))
            denom = iota%*%Xbeta
            Prob =  (Xbeta/ (iota%*%denom))
            Eprob = rowSums(Prob)/5  #the 5 coming from the last dim of array
            profit = c(profit,sum((p-costs)*Eprob))

        }


     sum(profit)

I can't think of a way to vectorize the part of the loop that picks out each unit's slice with these calls:

beta = dfarray[i,,]
Xt = dfmat [which(dfmat [,'unit']==i),]   #this takes a set of 13, Xt is 13x14

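Best answer

To make the comment I left in the comments section clearer, suppose we store dfmat as a list of matrices. Working with a list of matrices is almost always easier than working with one big named matrix. Also, if you want to fully vectorize the solution given here, you may want to use bdiag from the Matrix package, which operates on a list, to build a block-diagonal matrix.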

set.seed(1)
dfarray=array(rnorm(5*3*14,0,1),dim=c(3,14,5))
# dfmats as a list of matrices
dfmats <- lapply(1:3, function(i)matrix(rnorm(13*14), nrow=13))

Multiplying by iota is just colSums or rowSums, so we can simplify the operation as in f.

f <- function(Xbeta) rowSums(Xbeta / matrix(colSums(Xbeta), nrow=nrow(Xbeta), ncol=ncol(Xbeta), byrow=T)) / ncol(Xbeta)
#profits is written as a function for benchmarking
#cost and p are ignored as they can be easily added back in.
profits <- function(){ 
    Xbetas <- lapply(seq_len(dim(dfarray)[1]), function(i) exp(dfmats[[i]] %*% dfarray[i,,]))
    Eprobs <- lapply(Xbetas, f)
    unlist(Eprobs)
}
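
To see why f can drop iota, here is a small self-contained check. It is only a sketch on a made-up 13 x 5 matrix: the seed and the names Prob_iota and Prob_sweep are illustrative and not from the original post. It compares iota %*% Xbeta with colSums(Xbeta), and the loop's division by the expanded denominator with a column-wise sweep().

```r
set.seed(42)                                    # arbitrary seed, illustration only
Xbeta <- exp(matrix(rnorm(13 * 5), nrow = 13))  # stand-in for exp(Xt %*% beta)
iota  <- rep(1, nrow(Xbeta))

# iota %*% Xbeta is a 1 x 5 matrix holding the column totals
denom <- iota %*% Xbeta
sums_match <- isTRUE(all.equal(as.vector(denom), colSums(Xbeta)))

# Loop version: expand the totals to 13 x 5, then divide element-wise
Prob_iota <- Xbeta / (iota %*% denom)

# Same division without iota: divide each column by its own total
Prob_sweep <- sweep(Xbeta, 2, colSums(Xbeta), "/")
probs_match <- isTRUE(all.equal(Prob_iota, Prob_sweep))
```

Each column of Prob_sweep sums to 1, which is the normalization the loop builds with the two iota products.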

And your approach:

profits1 <- function(){
    profit=numeric(0)
    for(i in 1:dim(dfarray)[1]){
        j=13
        beta = dfarray[i,,]
        Xt = dfmat [which(dfmat [,'unit']==i),1:14]    #this takes a set of 13, Xt is 13x14

        Xbeta = exp( Xt %*% beta )
        iota = c(rep(1, j))
        denom = iota%*%Xbeta
        deno <- colSums(Xbeta)
        s <- iota%*%denom
        Prob =  (Xbeta/ s)
        Eprob = rowSums(Prob)/dim(dfarray)[3]  #the 100 coming from the last dim of array
        profit = c(profit,Eprob)

    }
    return(profit)
}
dfmat <- do.call(rbind, dfmats)
dfmat <- cbind(dfmat,rep(1:3, each=13))
colnames(dfmat)[15]='unit'

Check that they give the same result:

all.equal(profits(), profits1())
[1] TRUE
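
The functions above leave out p and costs. As a hedged sketch of how they might be added back in, the snippet below reuses f and the small example data, and weights each unit's expected probabilities by the margin p - costs. The names margin, profit_by_unit and total_profit are introduced here for illustration only.

```r
set.seed(1)
dfarray <- array(rnorm(5 * 3 * 14, 0, 1), dim = c(3, 14, 5))
dfmats  <- lapply(1:3, function(i) matrix(rnorm(13 * 14), nrow = 13))

costs <- c(0.45, 2.11, 1.05, 1.44, 0.88, 2.30, 1.96, 1.76, 2.06, 1.54, 1.69, 1.75, 0)
p     <- c(1, 2, 3, 1, 4, 3, 2, 1, 4, 1, 3, 4, 0)

# Same f as in the answer: row-wise mean of the column-normalized matrix
f <- function(Xbeta) {
  rowSums(Xbeta / matrix(colSums(Xbeta), nrow = nrow(Xbeta),
                         ncol = ncol(Xbeta), byrow = TRUE)) / ncol(Xbeta)
}

margin <- p - costs  # per-alternative margin, length 13

# One profit value per unit, as in the question's loop
profit_by_unit <- vapply(seq_along(dfmats), function(i) {
  sum(margin * f(exp(dfmats[[i]] %*% dfarray[i, , ])))
}, numeric(1))

total_profit <- sum(profit_by_unit)
```

vapply is used instead of growing a vector with c(), so the result length is fixed up front.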

Benchmark

I ran this on an AWS EC2 free-tier instance accessed via http://www.louisaslett.com/RStudio_AMI/.

dfarray=array(rnorm(100*10000*14,0,1),dim=c(10000,14,100))
dfmats <- lapply(1:10000, function(i)matrix(rnorm(13*14), nrow=13))

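Starting from your initial construction, you could convert dfmat into the list dfmats with dfmats <- lapply(1:3, function(i) dfmat[which(dfmat[,'unit']==i), 1:14]), but that is a very expensive conversion. Creating dfmat from dfmats is comparatively cheap.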

dfmat <- do.call(rbind, dfmats)
dfmat <- cbind(dfmat,rep(1:10000, each=13))
colnames(dfmat)[15]='unit'
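
Going the other way, from dfmat back to a list, does not have to rescan the unit column once per unit. The sketch below is one possible alternative that is not part of the original answer: it groups the row indices once with split() and then subsets. The names rows_by_unit and dfmats_split are hypothetical, and no timing is claimed here.

```r
set.seed(1)
dfmats <- lapply(1:3, function(i) matrix(rnorm(13 * 14), nrow = 13))
dfmat  <- cbind(do.call(rbind, dfmats), unit = rep(1:3, each = 13))

# Group the row indices by unit in a single pass over the unit column
rows_by_unit <- split(seq_len(nrow(dfmat)), dfmat[, "unit"])

# Pull out each unit's 13 x 14 block; drop = FALSE keeps matrix shape
dfmats_split <- unname(lapply(rows_by_unit, function(r) {
  dfmat[r, 1:14, drop = FALSE]
}))
```

The which()-based version does one full comparison of the unit column per unit, so its work grows with the number of units times the number of rows.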

Note the dramatic speedup from using a list, and how dangerous the cost of repeated lookups by name in the big matrix can be.

system.time(a1 <- profits1())
#   user  system elapsed 
#250.885   4.442 255.394 
system.time(a <- profits())
#   user  system elapsed 
#  2.717   0.429   3.167 
all.equal(a, a1)
#[1] TRUE
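
The fully vectorized route via bdiag mentioned at the start of the answer could look roughly like the sketch below, shown on the small 3-unit example. It assumes the Matrix package is installed. The names X, B, unit, denom and Eprob_bdiag are introduced here for illustration, and this variant was not part of the benchmark above.

```r
library(Matrix)  # assumed available; provides bdiag()

set.seed(1)
dfarray <- array(rnorm(5 * 3 * 14, 0, 1), dim = c(3, 14, 5))
dfmats  <- lapply(1:3, function(i) matrix(rnorm(13 * 14), nrow = 13))

# Block-diagonal design matrix: (3 * 13) x (3 * 14), sparse
X <- bdiag(dfmats)

# Stack each unit's 14 x 5 coefficient slice: (3 * 14) x 5
B <- do.call(rbind, lapply(1:3, function(i) dfarray[i, , ]))

# A single product replaces the per-unit products; result is 39 x 5
Xbeta <- exp(as.matrix(X %*% B))

# Per-unit column totals, expanded back to one row per alternative
unit  <- rep(1:3, each = 13)
denom <- rowsum(Xbeta, unit)  # 3 x 5, rows ordered by unit
Eprob_bdiag <- unname(rowMeans(Xbeta / denom[unit, , drop = FALSE]))
```

Eprob_bdiag has the same layout as the output of profits(): 13 values per unit, in unit order.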

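PS: I noticed you have asked several questions that may be related to this one, all of which were answered. I'd be glad if you shared how you managed to put them to use.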