Alain D.
2011-Feb-10 15:23 UTC
[R] highest and second highest value in row for each combination
Dear R-List,
I have a dataframe
area<-c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10))
type<-c(rep(1:10,5))
a<-rnorm(50)
b<-rnorm(50)
c<-rnorm(50)
d<-rnorm(50)
df<-cbind(area,type,a,b,c,d)
df
      area type           a            b           c           d
 [1,]    1    1  0.45608192  0.240378547  2.05208079 -1.18827462
 [2,]    1    2 -0.12119506 -0.028078577 -2.64323695 -0.83923441
 [3,]    1    3  0.09066133 -1.134069619  1.53344812 -0.15670239
 [4,]    1    4 -1.34505241  1.919941172 -1.02090099  0.75664358
 [5,]    1    5 -0.29279617 -0.314955019 -0.88809266  2.22282022
 [6,]    1    6 -0.59697893 -0.652937746  1.05132400 -0.02469151
 [7,]    1    7 -1.18199400  0.728165962 -1.51419348  0.65640976
 [8,]    1    8 -0.72925659  0.303514237  0.79758488  0.93444350
 [9,]    1    9 -1.60080508 -0.187562633  0.51288428 -0.55692877
[10,]    1   10  0.54373268 -0.494994392  0.52902381  1.12938122
[11,]    2    1 -1.29675664 -0.644990784 -2.44067511 -0.18489544
[12,]    2    2  0.86330699  1.458038882  1.17514710  1.32896878
[13,]    2    3  0.30069402  1.361211939  0.84757211  1.14502761
...
For each combination of area and type, I want the names and values of
the two columns among a, b, c, d that hold the highest and second-highest
values.
In the above example it should be something like
combination max colname
11 2.05 c
11 0.46 a
12 -0.03 b
12 -0.12 a
...
(It might be arranged differently, though)
Can anyone help?
Thank you in advance!
Alain
Phil Spector
2011-Feb-10 17:55 UTC
[R] highest and second highest value in row for each combination
Alain -
Here's a reproducible data set:
set.seed(19)
area<-c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10))
type<-c(rep(1:10,5))
a<-rnorm(50)
b<-rnorm(50)
c<-rnorm(50)
d<-rnorm(50)
df<-cbind(area,type,a,b,c,d)
First I'll make a helper function to operate on one
row of df (strictly a matrix, since it was built with cbind):
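get2 = function(x){
    y = x[-c(1,2)]                  # drop area and type, keep a, b, c, d
    oy = order(y,decreasing=TRUE)   # positions of the values, largest first
    nms = colnames(df)[-c(1,2)]     # names of the value columns
    # two output rows per input row: highest and second-highest value
    data.frame(area=rep(x[1],2),type=rep(x[2],2),
               max=y[oy[1:2]],colname=nms[oy[1:2]])
}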
Now I can use apply, do.call and rbind to get the answer:
> answer = do.call(rbind,apply(df,1,get2))
> head(answer)
   area type        max colname
b     1    1  1.7036697       b
c     1    1  0.7910130       c
c1    1    2  2.4576579       c
a     1    2  0.3885812       a
c2    1    3  1.2363598       c
a1    1    3 -0.3443333       a
(My numbers differ from yours because you didn't specify
a seed for the random number generator)
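As a sanity check, the same idea can be run on the first two rows printed
in the original post, where the expected result is known. This is only a
sketch: m is a hand-typed copy of those two rows, and top2 is a
hypothetical variant of get2 that takes the column names from the row
itself rather than from a global df.

```r
# First two rows of the matrix printed in the original post (typed in by hand)
m <- matrix(
  c(1, 1,  0.45608192,  0.240378547,  2.05208079, -1.18827462,
    1, 2, -0.12119506, -0.028078577, -2.64323695, -0.83923441),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, c("area", "type", "a", "b", "c", "d"))
)

# Hypothetical variant of get2: same logic, but the column names come from
# the named row vector that apply() passes in, not from a global object
top2 <- function(x) {
  y  <- x[-(1:2)]                          # keep only a, b, c, d
  oy <- order(y, decreasing = TRUE)[1:2]   # positions of the two largest values
  data.frame(area = x[["area"]], type = x[["type"]],
             max = unname(y[oy]), colname = names(y)[oy])
}

# apply() over rows returns a list of small data frames; rbind stacks them
res <- do.call(rbind, apply(m, 1, top2))
res$max <- round(res$max, 2)
print(res)
# colname is c, a for area 1 / type 1 and b, a for area 1 / type 2,
# with max 2.05, 0.46, -0.03, -0.12, as in the original post
```

The rounding is only there to make the comparison with the hand-written
table in the question easier.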
I'm not exactly sure how to form your column "combination",
though.
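
One possible reading, going only by the "11" and "12" in the example, is
that combination is just area and type pasted together. The sketch below
assumes that; the original post does not say so. The small data frame is a
stand-in built from the first four rows of the output shown earlier, and
combination_sep is a made-up column name for a variant with a separator.

```r
# Stand-in for the stacked result (first four rows of the output shown earlier)
answer <- data.frame(area = c(1, 1, 1, 1), type = c(1, 1, 2, 2),
                     max = c(1.7036697, 0.7910130, 2.4576579, 0.3885812),
                     colname = c("b", "c", "c", "a"))

# Assumption: "combination" is area and type pasted together ("11", "12", ...)
answer$combination <- paste0(answer$area, answer$type)

# With a separator the key stays unambiguous when a code has several digits
# (for example type 10)
answer$combination_sep <- paste(answer$area, answer$type, sep = "_")

print(answer)
```

With area running from 1 to 5 and type from 1 to 10 the pasted form is
still unique, but the separated form is easier to split apart again later.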
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu