top_n versus order in R

February 13, 2023 · 455 views

I can't make sense of the output of dplyr's top_n function. Can anyone help?

n <- 10
df <- data.frame(ref = sample(letters, n), score = rnorm(n))

library(dplyr)

print(dplyr::top_n(df, 5, score))
print(df[order(df$score, decreasing = TRUE)[1:5], ])

The output of top_n is not sorted by score as I expected. Compare it with the result of using the order function:

  ref      score
1   i 0.71556494
2   p 0.04463846
3   v 0.37290990
4   g 1.53206194
5   f 0.86307107
   ref      score
7    g 1.53206194
10   f 0.86307107
1    i 0.71556494
6    v 0.37290990
4    p 0.04463846

The documentation I have read also implies that top_n results should be ordered by the specified column, e.g.

https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf

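Best answer

Both outputs contain the same rows, but top_n does not reorder them: the selected rows stay in the order they had in df.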
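To make the "no reordering" point concrete, here is a minimal sketch that treats top_n as a filter on rank. The seed and the names top5 and by_rank are assumptions added for illustration, and the rank-based filter is only meant to mirror how top_n behaves on this data, not to describe its exact internals.

```r
library(dplyr)

set.seed(42)  # assumed seed, only so the example is reproducible
n <- 10
df <- data.frame(ref = sample(letters, n), score = rnorm(n))

# The five highest scores, as in the question
top5 <- top_n(df, 5, score)

# Keeping rows whose descending rank is at most 5 leaves them
# in their original order, which is what top_n appears to do
by_rank <- filter(df, min_rank(desc(score)) <= 5)

# Same rows, same (original) order
print(identical(top5$ref, by_rank$ref))

# The selected rows appear in the order they had in df
print(identical(top5$ref, df$ref[df$ref %in% top5$ref]))
```

Because top_n selects by rank, tied scores can yield more than five rows; with rnorm values ties are not expected in practice.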

You can use arrange() to get the same result as df[order(df$score, decreasing = TRUE)[1:5], ]:

top_n(df, 5, score) %>% arrange(desc(score))

Flipping the order, df[order(df$score, decreasing = FALSE)[1:5], ] is equivalent to top_n(df, -5, score) %>% arrange(score).
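As a quick check of both equivalences, the sketch below compares the two approaches column by column. The seed and the variable names (top_dplyr, top_base, bottom_dplyr, bottom_base) are assumptions for illustration. Row names are deliberately not compared, since the base R subset keeps the original row numbers.

```r
library(dplyr)

set.seed(1)  # assumed seed, only so the example is reproducible
df <- data.frame(ref = sample(letters, 10), score = rnorm(10))

# Five highest scores, sorted in descending order
top_dplyr <- top_n(df, 5, score) %>% arrange(desc(score))
top_base  <- df[order(df$score, decreasing = TRUE)[1:5], ]

# Five lowest scores, sorted in ascending order
bottom_dplyr <- top_n(df, -5, score) %>% arrange(score)
bottom_base  <- df[order(df$score, decreasing = FALSE)[1:5], ]

# Compare column values only
print(identical(top_dplyr$ref, top_base$ref))
print(identical(top_dplyr$score, top_base$score))
print(identical(bottom_dplyr$ref, bottom_base$ref))
print(identical(bottom_dplyr$score, bottom_base$score))
```

The comparison assumes there are no tied scores; with ties, top_n may return more than five rows while the order-based subset always returns exactly five.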