Hi, R-users,

If I have a data frame like this:

> x <- data.frame(g=c("g1","g2","g1","g1","g2"), v=c(1,7,3,2,8))
> x
   g v
1 g1 1
2 g2 7
3 g1 3
4 g1 2
5 g2 8

It contains two groups, g1 and g2. Now for each group I want the max v:

> aggregate(x$v, list(g=x$g), max)
   g x
1 g1 3
2 g2 8

Beautiful. But what if I want to keep the row index of (g1 3) and (g2 8) in the original x? What I want is:

> do something
   g x
3 g1 3
5 g2 8

Of course, it would make much more sense if the row indexes were row names that I want to keep.

Is there a simple way to do that?

Thanks a lot!

Z
Rather than aggregate, use order and duplicated, as in this post:

https://stat.ethz.ch/pipermail/r-help/2008-September/173139.html

On Wed, Sep 24, 2008 at 11:21 AM, zhihuali <lzhtom at hotmail.com> wrote:
> [...]
> But what if I want to keep the row index of (g1 3) and (g2 8) in the original x?
> [...]
> Is there a simple way to do that?
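The order/duplicated suggestion above could look roughly like the following. This is only a sketch built on the data frame from the question, not the code from the linked post; the names `o` and `res` are made up for illustration.

```r
# Example data from the original question
x <- data.frame(g = c("g1", "g2", "g1", "g1", "g2"),
                v = c(1, 7, 3, 2, 8))

# Sort by group, then by v in decreasing order,
# so the largest v of each group comes first within that group
o <- x[order(x$g, -x$v), ]

# Keep only the first row of each group; the original row names are preserved
res <- o[!duplicated(o$g), ]
print(res)
```

Because rows are only reordered and subset, never rebuilt, the row names of `x` carry through to `res` unchanged.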
Adaikalavan Ramasamy
2008-Sep-24 19:00 UTC
[R] keep the row indexes/names when do aggregate
Not the most elegant solution, but here goes.

  df <- data.frame(g=c("g1","g2","g1","g1","g2"), v=c(1,7,3,2,8))

  # Return the row name of the row where column `col` is largest
  rownames.which.max <- function(m, col){
    w <- which.max( m[ , col] )
    return( rownames(m)[w] )
  }

  # Split by group, then look up the row name of each group's maximum
  df.split <- split(df, df$g)
  ws <- sapply( df.split, rownames.which.max, col="v" )
  ws
   g1  g2
  "3" "5"

  df[ws, ]
     g v
  3 g1 3
  5 g2 8

Regards, Adai

zhihuali wrote:
> [...]
> But what if I want to keep the row index of (g1 3) and (g2 8) in the original x?
> [...]
> Is there a simple way to do that?
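The question also mentions meaningful row names rather than plain indexes. As a rough check, the same split/which.max idea can be tried on a data frame with named rows. The row names "a" to "e" below are hypothetical, and the helper is inlined as an anonymous function for brevity.

```r
# Same data as in the thread, but with hypothetical row names "a".."e"
df <- data.frame(g = c("g1", "g2", "g1", "g1", "g2"),
                 v = c(1, 7, 3, 2, 8),
                 row.names = c("a", "b", "c", "d", "e"))

# For each group, take the row name of the first maximum of v
ws <- sapply(split(df, df$g), function(m) rownames(m)[which.max(m$v)])

# Index the original data frame by those row names
res <- df[ws, ]
print(res)
```

Note that which.max returns only the first maximum, so a group with tied maxima contributes a single row.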