As an R beginner, I feel brain dead today, as I cannot find the answer to a relatively simple question.

Given an array of string values, for example "mary", "bob", "danny", "sue", and "jane", I am trying to work out how to test whether a variable is an exact match for one of the strings. The number of strings in the array is variable, and I want to avoid both a for loop and comparing against each value individually. Considering the power of R, I thought this would be easy, but it's not obvious to me.

Now, I may not yet be one with the R fu, so a bit more context.

I have a data frame that contains a column with text values. What I am trying to do is use the subset function on the data frame to select only the data for "sue" or "jane" (for example). But maybe I have not taken the correct approach?

So obviously I could do something like the following.

  subset(data_frame, name == "sue" | name == "jane", select = c(name, age, birthdate))

However, my subset needs to cover many more than two names and, being lazy, I do not want to type | name == "some text" for each one.

Is there another way?

Neil
Henrique Dallazuanna
2009-Jul-02 14:20 UTC
[R] Testing for membership in an array of strings
Try this:

  c("mary", "sue") %in% c("mary", "bob", "danny", "sue", "jane")

--
Henrique Dallazuanna
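To make the result of the expression above concrete: `%in%` returns one TRUE/FALSE per element of its left-hand side, so a single value gives a single logical that can go straight into `if()`. This is only a minimal sketch; the names `allowed`, `res`, `x` and `single`, and the extra value "fred", are made up for illustration and do not come from the thread.

```r
# Hypothetical names for illustration, not from the original posts.
allowed <- c("mary", "bob", "danny", "sue", "jane")

# One logical per element of the left-hand side.
res <- c("mary", "sue", "fred") %in% allowed
print(res)

# A single value gives a single TRUE/FALSE, usable in if().
x <- "sue"
single <- x %in% allowed
if (single) {
  message(x, " is one of the allowed names")
}

# Matching is exact and case-sensitive, so "Sue" is not found.
case_check <- "Sue" %in% allowed
print(case_check)
```

The length of `allowed` does not matter, which addresses the "number of strings is variable" part of the question.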
On Jul 2, 2009, at 9:13 AM, Neil Tiffin wrote:

> However, my subset needs to be much more than 2 and being lazy I do
> not want to type "| name == "some text" for each one.
>
> Is there another way?

Try this:

  subset(data_frame, name %in% c("sue", "jane"), select = c(name, age, birthdate))

or if your vector of names is:

  Names <- c("mary", "bob", "danny", "sue", "jane")

  subset(data_frame, name %in% Names, select = c(name, age, birthdate))

See ?"%in%" for more information.

HTH,

Marc Schwartz
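For anyone who wants to run the suggestion above, here is a small self-contained sketch. The column names (name, age, birthdate) come from the original question, but the ages, birthdates and the names `wanted`, `picked` and `picked2` are invented purely for illustration.

```r
# Made-up data; only the column names come from the thread.
data_frame <- data.frame(
  name      = c("mary", "bob", "danny", "sue", "jane"),
  age       = c(34, 45, 29, 52, 41),
  birthdate = as.Date(c("1975-03-01", "1964-07-15", "1980-01-20",
                        "1957-11-02", "1968-05-30")),
  stringsAsFactors = FALSE
)

# The names to keep; this vector can be of any length.
wanted <- c("sue", "jane")

# subset() evaluates the condition using the data frame's columns.
picked <- subset(data_frame, name %in% wanted,
                 select = c(name, age, birthdate))
print(picked)

# The same selection written with bracket indexing.
picked2 <- data_frame[data_frame$name %in% wanted,
                      c("name", "age", "birthdate")]
print(picked2)
```

Both forms keep only the rows whose name appears in `wanted`; which one to use is mostly a matter of taste, although ?subset itself recommends bracket indexing for non-interactive code.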
On 2 July 2009 at 09:13, Neil Tiffin wrote:
| However, my subset needs to be much more than 2 and being lazy I do
| not want to type "| name == "some text" for each one.
|
| Is there another way?

Yup, e.g. using the %in% operator:

> set.seed(42)   # fix rng so that you get the same data.frame
> neil <- data.frame(rownb=1:20, name=sample(c("mary", "bob", "danny", "sue",
+                    "jane"), 20, replace=TRUE))
> head(neil)     # quick sanity check
  rownb  name
1     1  jane
2     2  jane
3     3   bob
4     4  jane
5     5   sue
6     6 danny
> neil[ neil$name %in% c("sue", "jane"), ]
   rownb name
1      1 jane
2      2 jane
4      4 jane
5      5  sue
7      7  sue
9      9  sue
10    10  sue
12    12  sue
13    13 jane
16    16 jane
17    17 jane
>

Cheers, Dirk
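One pitfall worth noting when adapting the example above: writing `==` with a vector on the right-hand side is not the same as `%in%`. `==` compares element by element and recycles the shorter vector, so it can silently miss rows. A minimal sketch, using a made-up vector `nm` (not the data from the thread):

```r
# Hypothetical vector for illustration.
nm <- c("sue", "jane", "jane", "sue")

# == recycles c("sue", "jane") along nm and compares position by position:
# "sue"=="sue", "jane"=="jane", "jane"=="sue", "sue"=="jane"
by_equals <- nm == c("sue", "jane")
print(by_equals)

# %in% asks, for each element of nm, whether it occurs anywhere in the set.
by_in <- nm %in% c("sue", "jane")
print(by_in)
```

Here `by_equals` marks only the first two elements as matches, while `by_in` marks all four, which is the membership test the question asks for.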