natalie.vanzuydam
2011-Oct-05 15:53 UTC
[R] Subsetting a data frame with multiple values and exclusions.
Hi all,

I realise that the convention is to provide a working example of my problem, but the data are of a sensitive nature, so I'm not able to do that in this case.

I need to query a database for multiple search terms:

    db <- structure(list(ind = c("ind1", "ind2", "ind3", "ind4"),
        test1 = c(1, 2, 1.3, 3), test2 = c(56L, 27L, 58L, 2L),
        test3 = c(1.1, 28, 9, 1.2)),
        .Names = c("ind", "test1", "test2", "test3"),
        class = "data.frame", row.names = c(NA, -4L))

    terms_include <- c("1","2","3")
    terms_exclude <- c("1.1","1.2","1.3")

So I need to write a loop in which each value in terms_include is searched for across the entire data frame. I thought of using apply with grepl and subset? At the same time, if a value from terms_include occurs in the same row as a value from terms_exclude, then that row must be excluded from the output data frame.

I'm not sure where to even begin. I've only worked very basically with subset. The final database is much larger, and there are many more search terms than are presented here, so I would really need to be able to loop over the data frame successively to return a final df with my searched values in at least one of the columns.

Your help and assistance is much appreciated,
Natalie

-----
Natalie Van Zuydam
PhD Student
University of Dundee
nvanzuydam at dundee.ac.uk
--
View this message in context: http://r.789695.n4.nabble.com/Subsetting-a-data-frame-with-multiple-values-and-exclusions-tp3874967p3874967.html
Sent from the R help mailing list archive at Nabble.com.
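On the idea of using grepl: the following is a small sketch, not taken from the thread, of how pattern matching and exact matching differ for search terms like these. The vector `values` is a hypothetical sample picked from the example data for illustration only.

```r
# Hypothetical sample of cell values, written as character strings
values <- c("1", "1.1", "2", "28", "3")
terms_include <- c("1", "2", "3")

# grepl() looks for the pattern anywhere inside each string,
# so searching for "1" also flags "1.1"
substring_hits <- grepl("1", values, fixed = TRUE)

# %in% tests whether each value is exactly equal to one of the terms
exact_hits <- values %in% terms_include

print(data.frame(values, substring_hits, exact_hits))
```

If the search terms are meant as whole values rather than fragments, exact matching with `%in%` avoids picking up excluded values such as "1.1" when looking for "1".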
Dennis Murphy
2011-Oct-05 19:49 UTC
[R] Subsetting a data frame with multiple values and exclusions.
Hi:

Is this what you're after?

    f <- function(x) !any(x %in% terms_exclude) && any(x %in% terms_include)
    db[apply(db[, -1], 1, f), ]

       ind test1 test2 test3
    2 ind2     2    27  28.0
    4 ind4     3     2   1.2

HTH,
Dennis
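For reference, here is a self-contained sketch of the same row-wise approach, with the search terms passed as arguments instead of being read from the global environment. The helper name `filter_rows` and its `id_cols` argument are assumptions introduced here for illustration; they do not appear in the thread.

```r
# Example data as posted in the question
db <- data.frame(
  ind   = c("ind1", "ind2", "ind3", "ind4"),
  test1 = c(1, 2, 1.3, 3),
  test2 = c(56L, 27L, 58L, 2L),
  test3 = c(1.1, 28, 9, 1.2),
  stringsAsFactors = FALSE
)
terms_include <- c("1", "2", "3")
terms_exclude <- c("1.1", "1.2", "1.3")

# Hypothetical helper: keep rows that contain at least one include term
# and no exclude term, ignoring the identifier column(s) in id_cols.
filter_rows <- function(df, include, exclude, id_cols = 1) {
  # All remaining columns are numeric, so this is a numeric matrix
  vals <- as.matrix(df[, -id_cols, drop = FALSE])
  keep <- apply(vals, 1, function(x) {
    # %in% coerces the numeric row to character before comparing
    # it with the character search terms
    any(x %in% include) && !any(x %in% exclude)
  })
  df[keep, , drop = FALSE]
}

result <- filter_rows(db, terms_include, terms_exclude)
print(result)
```

On the four-row example exactly as posted, this sketch keeps only ind2; ind4 is dropped because its test3 value of 1.2 is one of terms_exclude. Because the comparison relies on numbers being converted to strings, values with many decimal places may not match their text form, so it may be safer to store the search terms as numbers when the data are numeric.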
natalie.vanzuydam
2011-Oct-06 08:55 UTC
[R] Subsetting a data frame with multiple values and exclusions.
Thanks. Such a short and sweet answer that does what it should.

Natalie