kborgmann
2012-Jul-25 17:10 UTC
[R] Select rows based on matching conditions and logical operators
Hi,

I have a dataset in which I would like to select rows based on matching conditions: within each group, return the row with the maximum value of a variable, or a single row if the maximum is tied. My dataset looks like this:

PGID  PTID   Year  Visit  Count
6755  53121  2009  1      0
6755  53121  2009  2      0
6755  53121  2009  3      0
6755  53122  2008  1      0
6755  53122  2008  2      0
6755  53122  2008  3      1
6755  53122  2009  1      0
6755  53122  2009  2      1
6755  53122  2009  3      2

I would like to group rows where PTID and Year match and return the row with the maximum Count, or just one row if the counts are the same, such that I get this output:

PGID  PTID   Year  Visit  Count
6755  53121  2009  1      0
6755  53122  2008  3      1
6755  53122  2009  3      2

I tried the following code and the output is almost correct, but duplicate values were included:

df2<-with(df, sapply(split(df, list(PTID, Year)),
  function(x) if (nrow(x)) x[which(x$Count==max(x$Count)),]))
df2<-do.call(rbind,df2)
rownames(df2)<-1:nrow(df2)

Any suggestions?
Thanks much for your responses!

--
View this message in context: http://r.789695.n4.nabble.com/Select-rows-based-on-matching-conditions-and-logical-operators-tp4637809.html
Sent from the R help mailing list archive at Nabble.com.
Rui Barradas
2012-Jul-25 17:23 UTC
[R] Select rows based on matching conditions and logical operators
Hello,

Apart from the output order, this does it. (I have changed 'df' to 'df1'; 'df' is an R function, the density of the F distribution.)

df1 <- read.table(text="
PGID PTID Year Visit Count
6755 53121 2009 1 0
6755 53121 2009 2 0
6755 53121 2009 3 0
6755 53122 2008 1 0
6755 53122 2008 2 0
6755 53122 2008 3 1
6755 53122 2009 1 0
6755 53122 2009 2 1
6755 53122 2009 3 2
", header=TRUE)

# Split by PTID and Year; empty combinations return NULL
df2 <- with(df1, sapply(split(df1, list(PTID, Year)),
    function(x) if (nrow(x)) x[which.max(x$Count), ]))
df2 <- do.call(rbind, df2)
rownames(df2) <- 1:nrow(df2)
df2

Use which.max(), not which(): which.max() returns only the first position of the maximum, so tied counts yield a single row.

Hope this helps,

Rui Barradas
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
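To illustrate why the original attempt returned duplicate rows, here is a minimal sketch on a single group in which all counts are tied. The vector name `count` and the result names are made up for this example; only `which()`, `max()` and `which.max()` come from the thread.

```r
# Counts for one PTID/Year group in which every visit has the same count
# (illustrative values, matching PTID 53121 in 2009)
count <- c(0, 0, 0)

# which() combined with == returns every position that equals the maximum,
# so a tied group contributes all of its rows
all_max <- which(count == max(count))

# which.max() returns only the first position of the maximum,
# so a tied group contributes exactly one row
first_max <- which.max(count)

print(all_max)
print(first_max)
```

With tied counts, `all_max` has one entry per tied row while `first_max` always has length one, which is the behaviour the question asks for.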
kborgmann
2012-Jul-25 17:40 UTC
[R] Select rows based on matching conditions and logical operators
Thanks! which.max did the trick
arun
2012-Jul-25 17:41 UTC
[R] Select rows based on matching conditions and logical operators
Hi,

Try this:

dat1<-read.table(text="
PGID    PTID    Year    Visit  Count
6755    53121    2009    1    0
6755    53121    2009    2    0
6755    53121    2009    3    0
6755    53122    2008    1    0
6755    53122    2008    2    0
6755    53122    2008    3    1
6755    53122    2009    1    0
6755    53122    2009    2    1
6755    53122    2009    3    2
",sep="",header=TRUE)
# Group by PTID and Year (drop=TRUE skips combinations with no rows),
# then keep the first row with the maximum Count in each group
dat2<-lapply(split(dat1,list(dat1$PTID,dat1$Year),drop=TRUE),function(x) x[which.max(x$Count),])
do.call(rbind,dat2)
           PGID  PTID Year Visit Count
53122.2008 6755 53122 2008     3     1
53121.2009 6755 53121 2009     1     0
53122.2009 6755 53122 2009     3     2

A.K.
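For reuse, the split/which.max idea from the replies can be wrapped in a small function. This is only a sketch under assumptions: the helper name `max_count_rows` is hypothetical, the column names PTID, Year and Count are taken from the question, and the data frame is rebuilt with `data.frame()` so the example stands alone.

```r
# Hypothetical helper: one row per PTID/Year group, namely the first row
# that holds the maximum Count in that group
max_count_rows <- function(d) {
  # drop = TRUE discards PTID/Year combinations that never occur in the data
  groups <- split(d, list(d$PTID, d$Year), drop = TRUE)
  picked <- lapply(groups, function(g) g[which.max(g$Count), ])
  out <- do.call(rbind, picked)
  # Sort by PTID, then Year, and renumber the rows
  out <- out[order(out$PTID, out$Year), ]
  rownames(out) <- NULL
  out
}

# The data from the question
dat <- data.frame(
  PGID  = 6755,
  PTID  = c(53121, 53121, 53121, 53122, 53122, 53122, 53122, 53122, 53122),
  Year  = c(2009, 2009, 2009, 2008, 2008, 2008, 2009, 2009, 2009),
  Visit = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  Count = c(0, 0, 0, 0, 0, 1, 0, 1, 2)
)

res <- max_count_rows(dat)
print(res)
```

Using `lapply()` rather than `sapply()` keeps the result a list regardless of how many groups are empty, so `do.call(rbind, ...)` always receives data frames.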