Hi,

I am relatively new to R and I am trying to figure out how to select rows of my data based on a condition. For example, in the dataset below, multiple observers recorded data at the same point on the same date, and each observation was recorded on a separate row, but I only need one row per point and date. Is it possible to use grep to search for rows that have matching PtName and Date and then, with an if statement, select the row where ObsID is the smallest (i.e., keep the rows marked with an * and remove the others)?

PtName  Visit  Date       Obs  ObsID
S2      1      6/8/2005   KB   3      *
S2      1      6/8/2005   JB   5
S3      1      6/8/2005   KB   3      *
S3      1      6/8/2005   JB   5
D1      1      6/12/2007  CD   11
D1      1      6/12/2007  TE   2      *
D1      1      6/12/2007  MB   4

I've tried splitting the data and writing an if...else statement, but have not been successful. I greatly appreciate the help.

Thanks Much
Kathi

--
View this message in context: http://r.789695.n4.nabble.com/Select-rows-based-on-condition-tp4562919p4562919.html
Sent from the R help mailing list archive at Nabble.com.
Hello,

Try

df1 <- read.table(text="
PtName Visit Date Obs ObsID
S2 1 6/8/2005 KB 3
S2 1 6/8/2005 JB 5
S3 1 6/8/2005 KB 3
S3 1 6/8/2005 JB 5
D1 1 6/12/2007 CD 11
D1 1 6/12/2007 TE 2
D1 1 6/12/2007 MB 4
", header=TRUE)
head(df1)

# Split by PtName/Date; in each non-empty group keep the row(s) with the smallest ObsID
df2 <- with(df1, lapply(split(df1, list(PtName, Date)), function(x)
    if(nrow(x)) x[which(x$ObsID == min(x$ObsID)), ]))
# Stack the per-group results into one data frame (empty groups are NULL and are skipped)
df2 <- do.call(rbind, df2)
rownames(df2) <- 1:nrow(df2)
df2

Hope this helps,

Rui Barradas
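For a larger data set, the same result can probably be reached without split() by sorting and then dropping repeated PtName/Date pairs. The following is only a sketch under stated assumptions, not the method from the reply above: it reuses the same sample data, the names sorted, keep and df3 are made up for illustration, and if two rows in a group tie on the smallest ObsID it keeps just one of them (the split() approach would keep both).

```r
# Same sample data as in the reply above
df1 <- read.table(text = "
PtName Visit Date Obs ObsID
S2 1 6/8/2005 KB 3
S2 1 6/8/2005 JB 5
S3 1 6/8/2005 KB 3
S3 1 6/8/2005 JB 5
D1 1 6/12/2007 CD 11
D1 1 6/12/2007 TE 2
D1 1 6/12/2007 MB 4
", header = TRUE, stringsAsFactors = FALSE)

# Sort so the smallest ObsID comes first within each PtName/Date pair
# (Date is sorted as text here, which is enough for grouping)
sorted <- df1[order(df1$PtName, df1$Date, df1$ObsID), ]

# duplicated() flags every row after the first in each PtName/Date pair,
# so negating it keeps only the row with the smallest ObsID
keep <- !duplicated(sorted[, c("PtName", "Date")])
df3 <- sorted[keep, ]
rownames(df3) <- NULL
df3
```

No grep is needed for this: grep matches text patterns, whereas here the rows are grouped by exact PtName and Date values.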
Thanks for the reply! I tried your suggestion, but R stops responding. Perhaps it is due to the size (>6,000 rows) of the dataset I am trying to manipulate??
Thanks Again!! R was holding a large file in memory so there was not enough memory to execute the function. After reading in the file separately, your code worked perfectly. Thanks!
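For anyone who hits the same memory problem, one way to release a large object that is no longer needed is rm() followed by gc(). This is a minimal sketch only: big_file is a hypothetical stand-in for whatever large object is sitting in the workspace, not something from the thread.

```r
# Hypothetical stand-in for a large object held in the workspace
big_file <- data.frame(x = runif(1e6))

# Check how much memory the object occupies
print(object.size(big_file), units = "MB")

# Remove the object and ask R to run garbage collection
rm(big_file)
invisible(gc())

# The name is no longer defined in the workspace
still_there <- exists("big_file", inherits = FALSE)
still_there
```

Starting a fresh R session and reading in only the file that is needed, as described above, has the same effect.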