Alain D.
2011-Feb-10 15:23 UTC
[R] highest and second highest value in row for each combination
Dear R-List,

I have a dataframe

area<-c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10))
type<-c(rep(1:10,5))
a<-rnorm(50)
b<-rnorm(50)
c<-rnorm(50)
d<-rnorm(50)
df<-cbind(area,type,a,b,c,d)

> df
      area type           a            b           c           d
 [1,]    1    1  0.45608192  0.240378547  2.05208079 -1.18827462
 [2,]    1    2 -0.12119506 -0.028078577 -2.64323695 -0.83923441
 [3,]    1    3  0.09066133 -1.134069619  1.53344812 -0.15670239
 [4,]    1    4 -1.34505241  1.919941172 -1.02090099  0.75664358
 [5,]    1    5 -0.29279617 -0.314955019 -0.88809266  2.22282022
 [6,]    1    6 -0.59697893 -0.652937746  1.05132400 -0.02469151
 [7,]    1    7 -1.18199400  0.728165962 -1.51419348  0.65640976
 [8,]    1    8 -0.72925659  0.303514237  0.79758488  0.93444350
 [9,]    1    9 -1.60080508 -0.187562633  0.51288428 -0.55692877
[10,]    1   10  0.54373268 -0.494994392  0.52902381  1.12938122
[11,]    2    1 -1.29675664 -0.644990784 -2.44067511 -0.18489544
[12,]    2    2  0.86330699  1.458038882  1.17514710  1.32896878
[13,]    2    3  0.30069402  1.361211939  0.84757211  1.14502761
...

Now I want to have for each combination of area and type the name and
corresponding value of the two columns with the highest and second highest
value a,b,c,d.
In the above example it should be something like

combination   max colname
         11  2.05       c
         11  0.46       a
         12 -0.03       b
         12 -0.12       a
...

(It might be arranged differently, though)

Can anyone help?

Thank you in advance!

Alain
Phil Spector
2011-Feb-10 17:55 UTC
[R] highest and second highest value in row for each combination
Alain -

Here's a reproducible data set:

set.seed(19)
area<-c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10))
type<-c(rep(1:10,5))
a<-rnorm(50)
b<-rnorm(50)
c<-rnorm(50)
d<-rnorm(50)
df<-cbind(area,type,a,b,c,d)

First I'll make a helper function to operate on one row of the data frame:

get2 = function(x){
    y = x[-c(1,2)]
    oy = order(y,decreasing=TRUE)
    nms = colnames(df)[-c(1,2)]
    data.frame(area=rep(x[1],2),type=rep(x[2],2),
               max=y[oy[1:2]],colname=nms[oy[1:2]])
}

Now I can use apply, do.call and rbind to get the answer:

> answer = do.call(rbind,apply(df,1,get2))
> head(answer)
   area type        max colname
b     1    1  1.7036697       b
c     1    1  0.7910130       c
c1    1    2  2.4576579       c
a     1    2  0.3885812       a
c2    1    3  1.2363598       c
a1    1    3 -0.3443333       a

(My numbers differ from yours because you didn't specify a seed for the
random number generator.)

I'm not exactly sure how to form your column "combination", though.

                                        - Phil Spector
                                          Statistical Computing Facility
                                          Department of Statistics
                                          UC Berkeley
                                          spector at stat.berkeley.edu
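
The reply above leaves the "combination" column open. Going by the example in the question (11, 12, ...), one plausible reading is that it is simply area and type pasted together; that reading is an assumption, not something either poster confirmed. The sketch below follows the same order()-based idea as get2, but indexes the matrix directly instead of building one small data frame per row. The names vals, top2, rows, cols and result are illustrative only.

```r
# Sketch under assumptions: "combination" = area and type pasted together.
set.seed(19)
area <- c(rep(1,10), rep(2,10), rep(3,10), rep(4,10), rep(5,10))
type <- c(rep(1:10, 5))
a <- rnorm(50)
b <- rnorm(50)
c <- rnorm(50)
d <- rnorm(50)
df <- cbind(area, type, a, b, c, d)   # a numeric matrix, as in the thread

# Only the columns that compete for "highest" and "second highest".
vals <- df[, c("a", "b", "c", "d")]

# For every row, the column positions of the largest and second-largest
# value. apply() returns a 2 x nrow(df) matrix here (one column per row).
top2 <- apply(vals, 1, function(y) order(y, decreasing = TRUE)[1:2])

# Two output rows per input row: first the maximum, then the runner-up.
rows <- rep(seq_len(nrow(df)), each = 2)
cols <- as.vector(top2)

result <- data.frame(
  combination = paste0(df[rows, "area"], df[rows, "type"]),
  max         = vals[cbind(rows, cols)],   # matrix indexing by (row, col) pairs
  colname     = colnames(vals)[cols],
  stringsAsFactors = FALSE
)

head(result)
```

Note that pasting without a separator gives "110" for area 1, type 10, which could become ambiguous with other data; paste(area, type, sep = "-") is a safer label if the exact format does not matter.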