Extracting data using matching matrix data pairs in R

September 22, 2023

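I have two datasets consisting of latitude, longitude, and temperature data. One dataset corresponds to a geographic region of interest; its lat/long pairs form the boundary and interior of the region (matrix dimensions = 4518 x 2).

The other dataset contains lat/long and temperature data for a larger region that encloses the region of interest (matrix dimensions = 10875 x 3).

My question is: how do I extract the appropriate rows (lat, long, temperature) from the second dataset that match the lat/long pairs in the first dataset?

I have tried various "for loops", "subset", and "unique" commands, but I cannot get the matching temperature data.

Thanks in advance!

Edit 10/31: I forgot to mention that I am using "R" to process this data.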

The lat/long data for the region of interest was provided as a list of 4,518 files, with the lat/long coordinates contained in the name of each file:

x<- dir()

lenx<- length(x)

g <- strsplit(x, "_")

coord1 <- matrix(NA,nrow=lenx, ncol=1)  
coord2 <- matrix(NA,nrow=lenx, ncol=1)

for(i in 1:lenx) {  
coord1[i,1] <- unlist(g)[2+3*(i-1)]  
coord2[i,1] <- unlist(g)[3+3*(i-1)]     
} 

coord1<-as.numeric(coord1)  
coord2<-as.numeric(coord2)

coord<- cbind(coord1, coord2)

The lat/long and temperature data come from an NCDF file that holds temperature data for 10,875 lat/long pairs:

long<- tempcd$var[["Temp"]]$size[1]   
lat<- tempcd$var[["Temp"]]$size[2]   
time<- tempcd$var[["Temp"]]$size[3]  
proj<- tempcd$var[["Temp"]]$size[4]  

temp<- matrix(NA, nrow=lat*long, ncol = time)  
lat_c<- matrix(NA, nrow=lat*long, ncol=1)  
long_c<- matrix(NA, nrow=lat*long, ncol =1)  

counter<- 1  

for(i in 1:lat){  
    for(j in 1:long){  
        temp[counter,]<-get.var.ncdf(tempcd, varid= "Temp", count = c(1,1,time,1), start=c(j,i,1,1))  
        counter<- counter+1  
    }  
}  

temp_gcm <- cbind(lat_c, long_c, temp)

So now the question is: how do I pull out the values in "temp_gcm" that correspond to the lat/long pairs in "coord"?

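Best answer

Noe,

I can think of a number of ways you could do this. The simplest, though not the most efficient, is to use R's which() function, which takes a logical argument, while iterating over the data frame you want to apply the matches to. Of course, this assumes there can be at most one match in the larger dataset. Based on your datasets, I would do something like this: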

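# coord and temp_gcm are matrices, so give them named columns as data frames first
# (column 3 of temp_gcm is taken as the temperature, per the 10875 x 3 layout)
coord <- data.frame(coord1 = coord[, 1], coord2 = coord[, 2])
temp_gcm <- data.frame(lat_c = temp_gcm[, 1], long_c = temp_gcm[, 2], temp = temp_gcm[, 3])

matched.temp <- rep(NA_real_, nrow(coord)) # To store matching results
for (i in seq_len(nrow(coord))) {
   # Rows of temp_gcm whose lat/long equal the i-th coordinate pair
   hit <- which(temp_gcm$lat_c == coord$coord1[i] & temp_gcm$long_c == coord$coord2[i])
   # Leave NA when there is no match; otherwise take the (single) match
   if (length(hit) > 0) matched.temp[i] <- temp_gcm$temp[hit[1]]
}

# Now add the results column to the coord data frame (indexes match)
coord$temperature <- matched.temp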

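The which() call returns a vector with the indices of all rows in the data frame temp_gcm whose lat_c and long_c match coord1 and coord2, respectively, of the i-th row in the iteration (note: I am assuming this vector has length 1, i.e. there is only one possible match). matched.temp[i] is then assigned the value from the temp column of temp_gcm at the row that satisfies the logical condition. The goal is to build a vector of matched values whose indices correspond to the rows of the data frame coord.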
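To make the behaviour of which() concrete, here is a minimal sketch on made-up data. The coordinates and temperatures are invented for illustration and are not taken from your files; only the column names follow the ones used in this thread. The third coordinate pair deliberately has no match, to show how the NA placeholder survives.

```r
# Toy stand-ins for the real data; all values are invented for illustration
temp_gcm <- data.frame(
  lat_c  = c(40.0, 40.0, 40.5, 40.5),
  long_c = c(-105.0, -104.5, -105.0, -104.5),
  temp   = c(10.1, 11.2, 12.3, 13.4)
)
coord <- data.frame(
  coord1 = c(40.5, 40.0, 41.0),       # latitudes of interest
  coord2 = c(-105.0, -104.5, -105.0)  # longitudes of interest
)

matched.temp <- rep(NA_real_, nrow(coord))
for (i in seq_len(nrow(coord))) {
  # Indices of the rows in temp_gcm whose lat/long equal the i-th pair
  hit <- which(temp_gcm$lat_c == coord$coord1[i] &
               temp_gcm$long_c == coord$coord2[i])
  # Keep NA when nothing matches; take the first hit otherwise
  if (length(hit) > 0) matched.temp[i] <- temp_gcm$temp[hit[1]]
}
coord$temperature <- matched.temp
print(coord)
```

One thing to watch for with real data: == compares doubles exactly, so if the coordinates parsed from the file names carry fewer decimals than the NCDF grid, valid pairs can fail to match. Comparing rounded values on both sides, for example round(x, 3), is one possible workaround, assuming the grid spacing is coarser than the rounding.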

I hope this helps. Note that this is a basic approach; I would suggest looking up the functions merge() and apply() to do this in a more concise way.
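As a rough sketch of the merge() route, again on invented data: merge() joins two data frames on the columns named in by.x and by.y, so the explicit loop disappears. The column names follow the ones used in this thread; the values are made up, and both inputs are assumed to already be data frames rather than matrices.

```r
# Toy stand-ins for the real data; all values are invented for illustration
temp_gcm <- data.frame(
  lat_c  = c(40.0, 40.0, 40.5, 40.5),
  long_c = c(-105.0, -104.5, -105.0, -104.5),
  temp   = c(10.1, 11.2, 12.3, 13.4)
)
coord <- data.frame(
  coord1 = c(40.5, 40.0, 41.0),
  coord2 = c(-105.0, -104.5, -105.0)
)

# Inner join: keeps only the lat/long pairs present in both data frames
matched <- merge(coord, temp_gcm,
                 by.x = c("coord1", "coord2"),
                 by.y = c("lat_c", "long_c"))

# all.x = TRUE keeps every row of coord, with NA where no match exists
matched_all <- merge(coord, temp_gcm,
                     by.x = c("coord1", "coord2"),
                     by.y = c("lat_c", "long_c"),
                     all.x = TRUE)
print(matched_all)
```

Note that merge() sorts the result by the key columns by default, so the rows will not necessarily come back in the original order of coord. The same caveat about exact comparison of floating-point coordinates applies here as with which().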