thr3ads.net - R help - [R] data frame select max group by like function [Mar 2010]

If this information is useful, please help other people find it:
Share via:

Tan, Richard

2010-Mar-09 18:35 UTC

[R] data frame select max group by like function

Hi, I have a data frame with 3 columns: ID, year and score.  How can I
select for each unique ID, the year that has the max score?  For
example, for data frame
 
ID, year, score
tom, 1995, 88
rick, 1994, 90
mary, 2000, 97
tom, 1998, 60
mary, 1998,100
 
I shall have
ID, year, score
tom, 1995, 88
rick, 1994, 90
mary, 1998,100
 
Thanks,
Richard

	[[alternative HTML version deleted]]

Henrique Dallazuanna

2010-Mar-09 18:58 UTC

head link

[R] data frame select max group by like function

Try this:

library(sqldf)
sqldf("SELECT ID, Year, MAX(score) FROM DF GROUP BY ID")

Or

do.call(rbind, lapply(split(DF <- DF[order(DF$score),], DF$ID), tail, 1))

On Tue, Mar 9, 2010 at 3:35 PM, Tan, Richard <RTan at panagora.com>
wrote:> Hi, I have a data frame with 3 columns: ID, year and score. ?How can I
> select for each unique ID, the year that has the max score? ?For
> example, for data frame
>
> ID, year, score
> tom, 1995, 88
> rick, 1994, 90
> mary, 2000, 97
> tom, 1998, 60
> mary, 1998,100
>
> I shall have
> ID, year, score
> tom, 1995, 88
> rick, 1994, 90
> mary, 1998,100
>
> Thanks,
> Richard
>
> ? ? ? ?[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Henrique Dallazuanna
Curitiba-Paran?-Brasil
25? 25' 40" S 49? 16' 22" O

Ista Zahn

2010-Mar-09 18:59 UTC

head link

[R] data frame select max group by like function

Hi Richard,
There are probably better ways, but here is one approach:

Dat <- read.table(textConnection("ID, year, score
tom, 1995, 88
rick, 1994, 90
mary, 2000, 97
tom, 1998, 60
mary, 1998,100"), header=TRUE, sep=",")

MaxScore <- aggregate(Dat$score, list(Dat$ID), max)
names(MaxScore) <- c("ID", "score")
MaxYear <- merge(Dat, MaxScore)

Best,
Ista


On Tue, Mar 9, 2010 at 1:35 PM, Tan, Richard <RTan at panagora.com>
wrote:> Hi, I have a data frame with 3 columns: ID, year and score. ?How can I
> select for each unique ID, the year that has the max score? ?For
> example, for data frame
>
> ID, year, score
> tom, 1995, 88
> rick, 1994, 90
> mary, 2000, 97
> tom, 1998, 60
> mary, 1998,100
>
> I shall have
> ID, year, score
> tom, 1995, 88
> rick, 1994, 90
> mary, 1998,100
>
> Thanks,
> Richard
>
> ? ? ? ?[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

Peter Ehlers

2010-Mar-09 19:22 UTC

head link

[R] data frame select max group by like function

I find ddply() in package plyr handy for this sort of thing:

  library(plyr)
  f <- function(x) x[which.max( x[["score"]] ), ]
     ## x will be a subset of Dat according to ID
  ddply(Dat, "ID", f)

  -Peter Ehlers

On 2010-03-09 11:59, Ista Zahn wrote:> Hi Richard,
> There are probably better ways, but here is one approach:
>
> Dat<- read.table(textConnection("ID, year, score
> tom, 1995, 88
> rick, 1994, 90
> mary, 2000, 97
> tom, 1998, 60
> mary, 1998,100"), header=TRUE, sep=",")
>
> MaxScore<- aggregate(Dat$score, list(Dat$ID), max)
> names(MaxScore)<- c("ID", "score")
> MaxYear<- merge(Dat, MaxScore)
>
> Best,
> Ista
>
>
> On Tue, Mar 9, 2010 at 1:35 PM, Tan, Richard<RTan at panagora.com> 
wrote:
>> Hi, I have a data frame with 3 columns: ID, year and score.  How can I
>> select for each unique ID, the year that has the max score?  For
>> example, for data frame
>>
>> ID, year, score
>> tom, 1995, 88
>> rick, 1994, 90
>> mary, 2000, 97
>> tom, 1998, 60
>> mary, 1998,100
>>
>> I shall have
>> ID, year, score
>> tom, 1995, 88
>> rick, 1994, 90
>> mary, 1998,100
>>
>> Thanks,
>> Richard
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
-- 
Peter Ehlers
University of Calgary

Phil Spector

2010-Mar-09 19:54 UTC

head link

[R] data frame select max group by like function

Yet another way to do this with base R:
> dat = read.csv(textConnection('ID, year, score+ tom, 1995, 88
+ rick, 1994, 90
+ mary, 2000, 97
+ tom, 1998, 60
+ mary, 1998,100'))> do.call(rbind,lapply(split(dat,dat$ID),function(x)x[which.max(x$score),]))        ID year score
mary mary 1998   100
rick rick 1994    90
tom   tom 1995    88

 					- Phil Spector
 					 Statistical Computing Facility
 					 Department of Statistics
 					 UC Berkeley
 					 spector at stat.berkeley.edu


On Tue, 9 Mar 2010, Tan, Richard wrote:
> Hi, I have a data frame with 3 columns: ID, year and score.  How can I
> select for each unique ID, the year that has the max score?  For
> example, for data frame
>
> ID, year, score
> tom, 1995, 88
> rick, 1994, 90
> mary, 2000, 97
> tom, 1998, 60
> mary, 1998,100
>
> I shall have
> ID, year, score
> tom, 1995, 88
> rick, 1994, 90
> mary, 1998,100
>
> Thanks,
> Richard
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Gabor Grothendieck

2010-Mar-09 20:18 UTC

head link

[R] data frame select max group by like function

Try this.  The aggregate call gives a data frame with the ID and max
score.  Then we merge that back with the original data frame so that
we pick up the year too:

merge(DF, aggregate(DF['score'], DF['ID'], max))

On Tue, Mar 9, 2010 at 1:35 PM, Tan, Richard <RTan at panagora.com>
wrote:> Hi, I have a data frame with 3 columns: ID, year and score. ?How can I
> select for each unique ID, the year that has the max score? ?For
> example, for data frame
>
> ID, year, score
> tom, 1995, 88
> rick, 1994, 90
> mary, 2000, 97
> tom, 1998, 60
> mary, 1998,100
>
> I shall have
> ID, year, score
> tom, 1995, 88
> rick, 1994, 90
> mary, 1998,100
>
> Thanks,
> Richard
>
> ? ? ? ?[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Possibly Parallel Threads

Search for more seemingly similar threads

R help - Mar 2010 - data frame select max group by like function

[R] data frame select max group by like function

[R] data frame select max group by like function

[R] data frame select max group by like function

[R] data frame select max group by like function

[R] data frame select max group by like function

[R] data frame select max group by like function

Possibly Parallel Threads